ACTIVIN RECEPTOR-LIKE KINASES, PROTEINS HAVING
SERINE THREONINE KINASE DOMAINS AND THEIR USE
************ 

Field of the Invention 

This invention relates to proteins having
serine/threonine kinase domains, corresponding nucleic acid
molecules, and their use.
Background of the Invention 

The transforming growth factor-β (TGF-β) superfamily
consists of a family of structurally-related proteins,
including three different mammalian isoforms of TGF-β (TGF-
β1, β2 and β3), activins, inhibins, müllerian-inhibiting
substance and bone morphogenic proteins (BMPs) (for reviews
see Roberts and Sporn (1990) Peptide Growth Factors and
Their Receptors, Pt. 1, Sporn and Roberts, eds. (Berlin:
Springer-Verlag) pp 419-472; Moses et al (1990) Cell 63,
245-247). The proteins of the TGF-β superfamily have a
wide variety of biological activities. TGF-β acts as a
growth inhibitor for many cell types and appears to play a
central role in the regulation of embryonic development,
tissue regeneration, immuno-regulation, as well as in
fibrosis and carcinogenesis (Roberts and Sporn (1990), see
above).

Activins and inhibins were originally identified as
factors which regulate secretion of follicle-stimulating
hormone (Vale et al (1990) Peptide Growth Factors
and Their Receptors, Pt. 2, Sporn and Roberts, eds. (Berlin:
Springer-Verlag) pp. 211-248). Activins were also shown to
induce the differentiation of haematopoietic progenitor
cells (Murata et al (1988) Proc. Natl. Acad. Sci. USA 85,
2434-2438; Eto et al (1987) Biochem. Biophys. Res.
Commun. 142, 1095-1103) and induce mesoderm formation in
Xenopus embryos (Smith et al (1990) Nature 345, 729-731;
van den Eijnden-Van Raaij et al (1990) Nature 345, 732-
734).

BMPs or osteogenic proteins, which induce the formation
of bone and cartilage when implanted subcutaneously (Wozney
et al (1988) Science 242, 1528-1534), facilitate neuronal
differentiation (Paralkar et al (1992) J. Cell Biol. 119,
1721-1728) and induce monocyte chemotaxis (Cunningham et al
(1992) Proc. Natl. Acad. Sci. USA 89, 11740-11744).
Müllerian-inhibiting substance induces regression of the
Müllerian duct in the male reproductive system (Cate et al
(1986) Cell 45, 685-698), and a glial cell line-derived
neurotrophic factor enhances survival of midbrain
dopaminergic neurons (Lin et al (1993) Science 260, 1130-
1132). The action of these growth factors is mediated
through binding to specific cell surface receptors.

Within this family, TGF-β receptors have been most
thoroughly characterized. By covalently cross-linking
radio-labelled TGF-β to cell surface molecules followed by
polyacrylamide gel electrophoresis of the affinity-labelled
complexes, three distinct size classes of cell surface
proteins (in most cases) have been identified, denoted
receptor type I (53 kd), type II (75 kd), type III or
betaglycan (a 300 kd proteoglycan with a 120 kd core
protein) (for a review see Massague (1992) Cell 69, 1067-
1070) and more recently endoglin (a homodimer of two 95 kd
subunits) (Cheifetz et al (1992) J. Biol. Chem. 267, 19027-
19030). Current evidence suggests that type I and type II
receptors are directly involved in receptor signal
transduction (Segarini et al (1989) Mol. Endo. 3, 261-272;
Laiho et al (1991) J. Biol. Chem. 266, 9100-9112) and may
form a heteromeric complex; the type II receptor is needed
for the binding of TGF-β to the type I receptor and the
type I receptor is needed for the signal transduction
induced by the type II receptor (Wrana et al (1992) Cell
71, 1003-1014). The type III receptor and endoglin may
have more indirect roles, possibly by facilitating the
binding of ligand to type II receptors (Wang et al (1991)
Cell 67, 797-805; López-Casillas et al (1993) Cell 73,
1435-1444).

Binding analyses with activin A and BMP4 have led to
the identification of two co-existing cross-linked affinity
complexes of 50-60 kDa and 70-80 kDa on responsive cells


WO 94/11502



PCT/GB93/02367 




(Hino et al (1989) J. Biol. Chem. 10309-10314;
Mathews and Vale (1991) Cell 65, 775-785; Paralkar et al
(1991) Proc. Natl. Acad. Sci. USA 88, 8913-8917). By
analogy with TGF-β receptors they are thought to be
signalling receptors and have been named type I and type II
receptors.

Among the type II receptors for the TGF-β superfamily
of proteins, the cDNA for the activin type II receptor (Act
RII) was the first to be cloned (Mathews and Vale (1991)
Cell 65, 973-982). The predicted structure of the receptor
was shown to be a transmembrane protein with an
intracellular serine/threonine kinase domain. The activin
receptor is related to the C. elegans daf-1 gene product,
but the ligand is currently unknown (Georgi et al (1990)
Cell 61, 635-645). Thereafter, another form of the activin
type II receptor (activin type IIB receptor), of which
there are different splicing variants (Mathews et al
(1992) Science 255, 1702-1705; Attisano et al (1992) Cell
97-108), and the TGF-β type II receptor (TβRII) (Lin et
al (1992) Cell 68, 775-785) were cloned, both of which have
putative serine/threonine kinase domains.
Summary of the Invention 

The present invention involves the discovery of
related novel peptides, including peptides having the
activity of those defined herein as SEQ ID Nos. 2, 4, 8,
10, 12, 14, 16 and 18. Their discovery is based on the
realisation that receptor serine/threonine kinases form a
new receptor family, which may include the type II
receptors for other proteins in the TGF-β superfamily. To
ascertain whether there were other members of this family
of receptors, a protocol was designed to clone ActRII/daf-
1 related cDNAs. This approach made use of the polymerase
chain reaction (PCR), using degenerate primers based upon
the amino-acid sequence similarity between kinase domains
of the mouse activin type II receptor and daf-1 gene
products.

This strategy resulted in the isolation of a new
family of receptor kinases called Activin receptor-Like
Kinases (ALKs) 1-6. These cDNAs showed an overall 33-39%
sequence similarity with ActRII and TGF-β type II receptor
and 40-92% sequence similarity towards each other in the
kinase domains.

Soluble receptors according to the invention comprise
at least predominantly the extracellular domain. These can
be selected from the information provided herein, prepared
in conventional manner, and used in any manner associated
with the invention.

Antibodies to the peptides described herein may be
raised in conventional manner. By selecting unique
sequences of the peptides, antibodies having desired
specificity can be obtained.

The antibodies may be monoclonal, prepared in known 
manner. In particular, monoclonal antibodies to the 
extracellular domain are of potential value in therapy. 

Products of the invention are useful in diagnostic
methods, e.g. to determine the presence in a sample of an
analyte binding therewith, such as in an antagonist assay.
Conventional techniques, e.g. an enzyme-linked
immunosorbent assay, may be used.

Products of the invention having a specific receptor
activity can be used in therapy, e.g. to modulate
conditions associated with activin or TGF-β activity. Such
conditions include fibrosis, e.g. liver cirrhosis and
pulmonary fibrosis, cancer, rheumatoid arthritis and
glomerulonephritis.
Brief Description of the Drawings

Figure 1 shows the alignment of the serine/threonine
(S/T) kinase domains (I-VIII) of related receptors from
transmembrane proteins, including embodiments of the
present invention. The nomenclature of the subdomains is
according to Hanks et al (1988).

Figures 2A to 2D show the sequences and
characteristics of the respective primers used in the
initial PCR reactions. The nucleic acid sequences are also
given as SEQ ID Nos. 19 to 22.

Figure 3 is a comparison of the amino-acid sequences
of human activin type II receptor (Act R-II), mouse activin
type IIB receptor (Act R-IIB), human TGF-β type II receptor
(TβR-II), human TGF-β type I receptor (ALK-5), human
activin receptor type IA (ALK-2), and type IB (ALK-4), ALKs
1 & 3 and mouse ALK-6.

Figure 4 shows, schematically, the structures for daf-
1, Act R-II, Act R-IIB, TβR-II, TβR-I/ALK-5, ALKs -1, -2
(Act RIA), -3, -4 (Act RIB) & -6.

Figure 5 shows the sequence alignment of the cysteine-
rich domains of the ALKs, TβR-II, Act R-II, Act R-IIB and
daf-1 receptors.

Figure 6 is a comparison of kinase domains of
serine/threonine kinases, showing the percentage amino-acid
identity of the kinase domains.

Figure 7 shows the pairwise alignment relationship
between the kinase domains of the receptor serine/threonine
kinases. The dendrogram was generated using the Jotun-Hein
alignment program (Hein (1990) Meth. Enzymol. 183, 626-
645).

Brief Description of the Sequence Listings

Sequences 1 and 2 are the nucleotide and deduced
amino-acid sequences of cDNA for hALK-1 (clone HP57).

Sequences 3 and 4 are the nucleotide and deduced
amino-acid sequences of cDNA for hALK-2 (clone HP53).

Sequences 5 and 6 are the nucleotide and deduced
amino-acid sequences of cDNA for hALK-3 (clone ONF5).

Sequences 7 and 8 are the nucleotide and deduced amino-
acid sequences of cDNA for hALK-4 (clone 11H8),
complemented with PCR product encoding extracellular
domain.

Sequences 9 and 10 are the nucleotide and deduced
amino-acid sequences of cDNA for hALK-5 (clone EMBLA).

Sequences 11 and 12 are the nucleotide and deduced
amino-acid sequences of cDNA for mALK-1 (clone AM6).

Sequences 13 and 14 are the nucleotide and deduced
amino-acid sequences of cDNA for mALK-3 (clones ME-7 and
ME-D).

Sequences 15 and 16 are the nucleotide and deduced
amino-acid sequences of cDNA for mALK-4 (clone Sal).

Sequences 17 and 18 are the nucleotide and deduced
amino-acid sequences of cDNA for mALK-6 (clone ME-6).

Sequence 19 (B1-S) is a sense primer, extracellular
domain, cysteine-rich region, BamHI site at 5' end, 28-mer,
64-fold degeneracy.

Sequence 20 (B3-S) is a sense primer, kinase domain
II, BamHI site at 5' end, 25-mer, 162-fold degeneracy.

Sequence 21 (B7-S) is a sense primer, kinase domain
VIB, S/T kinase specific residues, BamHI site at 5' end,
24-mer, 288-fold degeneracy.

Sequence 22 (E8-AS) is an anti-sense primer, kinase
domain, S/T kinase-specific residues, EcoRI site at 5' end,
20-mer, 18-fold degeneracy.

Sequence 23 is an oligonucleotide probe.

Sequence 24 is a 5' primer.

Sequence 25 is a 3' primer.

Sequence 26 is a consensus sequence in Subdomain I.

Sequences 27 and 28 are novel sequence motifs in
Subdomain VIB.

Sequence 29 is a novel sequence motif in Subdomain
VIII.

Description of the Invention

As described in more detail below, nucleic acid
sequences have been isolated, coding for a new sub-family
of serine/threonine receptor kinases. The term nucleic
acid molecules as used herein refers to any sequence which
codes for the murine, human or mammalian form, amino-acid
sequences of which are presented herein. It is understood
that the well known phenomenon of codon degeneracy provides
for a great deal of sequence variation and all such
varieties are included within the scope of this invention.

The nucleic acid sequences described herein may be
used to clone the respective genomic DNA sequences in order
to study the genes' structure and regulation. The murine
and human cDNA or genomic sequences can also be used to
isolate the homologous genes from other mammalian species.
The mammalian DNA sequences can be used to study the
receptors' functions in various in vitro and in vivo model
systems.

As exemplified below for ALK-5 cDNA, it is also
recognised that, given the sequence information provided
herein, the artisan could easily combine the molecules with
a pertinent promoter in a vector, so as to produce a
cloning vehicle for expression of the molecule. The
promoter and coding molecule must be operably linked via
any of the well-recognized and easily-practised
methodologies for so doing. The resulting vectors, as well
as the isolated nucleic acid molecules themselves, may be
used to transform prokaryotic cells (e.g. E. coli), or
transfect eukaryotes such as yeast (S. cerevisiae), PAE,
COS or CHO cell lines. Other appropriate expression
systems will also be apparent to the skilled artisan.

Several methods may be used to isolate the ligands for
the ALKs. As shown for ALK-5 cDNA, cDNA clones encoding
the active open reading frames can be subcloned into
expression vectors and transfected into eukaryotic cells,
for example COS cells. The transfected cells which can
express the receptor can be subjected to binding assays for
radioactively-labelled members of the TGF-β superfamily
(TGF-β, activins, inhibins, bone morphogenic proteins and
müllerian-inhibiting substances), as it may be expected
that the receptors will bind members of the TGF-β
superfamily. Various biochemical or cell-based assays can
be designed to identify the ligands, in tissue extracts or
conditioned media, for receptors in which a ligand is not
known. Antibodies raised to the receptors may also be used
to identify the ligands, using the immunoprecipitation of
the cross-linked complexes. Alternatively, purified
receptor could be used to isolate the ligands using an
affinity-based approach. The determination of the
expression patterns of the receptors may also aid in the
isolation of the ligand. These studies may be carried out
using ALK DNA or RNA sequences as probes to perform in situ
hybridisation studies.

The use of various model systems or structural studies
should enable the rational development of specific agonists
and antagonists useful in regulating receptor function. It
may be envisaged that these can be peptides, mutated
ligands, antibodies or other molecules able to interact
with the receptors.

The foregoing provides examples of the invention that
Applicants intend to claim, which includes, inter alia,
isolated nucleic acid molecules coding for activin
receptor-like kinases (ALKs), as defined herein. These
include such sequences isolated from mammalian species such
as mouse, human, rat, rabbit and monkey.

The following description relates to specific
embodiments. It will be understood that the specification
and examples are illustrative but not limitative of the
present invention and that other embodiments within the
spirit and scope of the invention will suggest themselves
to those skilled in the art.

Preparation of mRNA and Construction of a cDNA Library

For construction of a cDNA library, poly(A)+ RNA was
isolated from a human erythroleukemia cell line (HEL
92.1.7) obtained from the American Type Culture Collection
(ATCC TIB 180). These cells were chosen as they have been
shown to respond to both activin and TGF-β. Moreover,
leukaemic cells have proved to be rich sources for the
cloning of novel receptor tyrosine kinases (Partanen et al
(1990) Proc. Natl. Acad. Sci. USA 87, 8913-8917 and (1992)
Mol. Cell. Biol. 12, 1698-1707). Total RNA was prepared
by the guanidinium isothiocyanate method (Chirgwin et al
(1979) Biochemistry 18, 5294-5299). mRNA was selected
using the poly-A or PolyATtract mRNA isolation kit
(Promega, Madison, Wisconsin, U.S.A.) as described by the
manufacturers, or purified through an oligo (dT)-cellulose
column as described by Aviv and Leder (1972) Proc. Natl.
Acad. Sci. USA 69, 1408-1412. The isolated mRNA was used
for the synthesis of random primed (Amersham) cDNA, that
was used to make a λgt10 library with 1 x 10^5 independent
cDNA clones using the Riboclone cDNA synthesis system
(Promega) and λgt10 in vitro packaging kit (Amersham)
according to the manufacturers' procedures. An amplified
oligo (dT) primed human placenta λZAPII cDNA library of
5 x 10^5 independent clones was used. Poly(A)+ RNA isolated
from AG1518 human foreskin fibroblasts was used to prepare
a primary random primed λZAPII cDNA library of 1.5 x 10^6
independent clones using the RiboClone cDNA synthesis
system and Gigapack Gold II packaging extract (Stratagene).
In addition, a primary oligo (dT) primed human foreskin
fibroblast λgt10 cDNA library (Claesson-Welsh et al (1989)
Proc. Natl. Acad. Sci. USA 86, 4917-4921) was prepared. An
amplified oligo (dT) primed HEL cell λgt11 cDNA library of
1.5 x 10^6 independent clones (Poncz et al (1987) Blood 69,
219-223) was used. A twelve-day mouse embryo λEXlox cDNA
library was obtained from Novagen (Madison, Wisconsin,
U.S.A.); a mouse placenta λZAPII cDNA library was also
used.

Generation of cDNA Probes by PCR

For the generation of cDNA probes by PCR (Lee et al
(1988) Science 239, 1288-1291) degenerate PCR primers were
constructed based upon the amino-acid sequence similarity
between the mouse activin type II receptor (Mathews and
Vale (1991) Cell 65, 973-982) and daf-1 (Georgi et al
(1990) Cell 61, 635-645) in the kinase domains II and VIII.
Figure 1 shows the aligned serine/threonine kinase domains
(I-VIII) of four related receptors of the TGF-β
superfamily, i.e. hTβR-II, mActR-IIB, mActR-II and the daf-
1 gene product, using the nomenclature of the subdomains
according to Hanks et al (1988) Science 241, 42-52.

Several considerations were applied in the design of
the PCR primers. The sequences were taken from regions of
homology between the activin type II receptor and the daf-1
gene product, with particular emphasis on residues that
confer serine/threonine specificity (see Table 2) and on
residues that are shared by transmembrane kinase proteins
and not by cytoplasmic kinases. The primers were designed
so that each primer of a PCR set had an approximately
similar GC composition, and so that self complementarity
and complementarity between the 3' ends of the primer sets
were avoided. Degeneracy of the primers was kept as low as
possible, in particular avoiding serine, leucine and
arginine residues (6 possible codons), and human codon
preference was applied. Degeneracy was particularly
avoided at the 3' end as, unlike the 5' end, where
mismatches are tolerated, mismatches at the 3' end
dramatically reduce the efficiency of PCR.
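By way of illustration only, the degeneracy considerations above may be sketched in Python. The degeneracy of a primer is the product of the number of bases allowed at each position, and residues with 6 possible codons (serine, leucine, arginine) inflate it quickly. The primer and peptides used below are hypothetical examples and are not sequences of the invention; the function names are likewise assumptions made for this sketch.

```python
# Number of bases represented by each IUPAC nucleotide code.
IUPAC_COUNTS = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "S": 2, "W": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3,
    "N": 4,
}

# Number of codons per amino acid in the standard genetic code.
CODONS_PER_RESIDUE = {
    "M": 1, "W": 1,
    "C": 2, "D": 2, "E": 2, "F": 2, "H": 2, "K": 2, "N": 2, "Q": 2, "Y": 2,
    "I": 3,
    "A": 4, "G": 4, "P": 4, "T": 4, "V": 4,
    "L": 6, "R": 6, "S": 6,
}


def primer_degeneracy(primer: str) -> int:
    """Fold degeneracy of a primer written with IUPAC ambiguity codes."""
    fold = 1
    for base in primer.upper():
        fold *= IUPAC_COUNTS[base]
    return fold


def peptide_codon_combinations(peptide: str) -> int:
    """Number of distinct coding sequences for a peptide (one-letter code)."""
    fold = 1
    for residue in peptide.upper():
        fold *= CODONS_PER_RESIDUE[residue]
    return fold


# Hypothetical peptide Cys-Glu-Gly-Asn and a fully degenerate primer for it.
print(primer_degeneracy("TGYGARGGNAAY"))
print(peptide_codon_combinations("CEGN"))
# A stretch of Leu-Ser-Arg alone contributes 6 x 6 x 6 combinations.
print(peptide_codon_combinations("LSR"))
```

Applying human codon preference, as described above, amounts to replacing some ambiguity codes with a single preferred base, which lowers the product accordingly.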

In order to facilitate directional subcloning,
restriction enzyme sites were included at the 5' end of the
primers, with a GC clamp, which permits efficient
restriction enzyme digestion. The primers utilised are
shown in Figure 2. Oligonucleotides were synthesized using
Gene Assembler Plus (Pharmacia-LKB) according to the
manufacturer's instructions.
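The primer layout described above (GC clamp, restriction site, then the gene-specific portion) can be sketched as follows. This is a minimal illustration under stated assumptions: the two-base clamp "GC", the core sequences and the 4-base window used for the 3' end check are hypothetical choices, not values taken from Figure 2. Only the BamHI (GGATCC) and EcoRI (GAATTC) recognition sequences are standard.

```python
RESTRICTION_SITES = {"BamHI": "GGATCC", "EcoRI": "GAATTC"}
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def add_five_prime_site(core: str, enzyme: str, clamp: str = "GC") -> str:
    """Prefix a primer core with a clamp and a restriction site (5' to 3')."""
    return clamp + RESTRICTION_SITES[enzyme] + core.upper()


def gc_fraction(seq: str) -> float:
    """Fraction of unambiguous G or C bases in a sequence."""
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq)


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def three_prime_ends_pair(primer_a: str, primer_b: str, window: int = 4) -> bool:
    """True if the last `window` bases of both primers are fully complementary."""
    end_a = primer_a.upper()[-window:]
    end_b = primer_b.upper()[-window:]
    return end_a == reverse_complement(end_b)


# Hypothetical core sequences, used only to exercise the helpers.
sense = add_five_prime_site("ACGTACGTTTAAGG", "BamHI")
antisense = add_five_prime_site("TTGCATGCAAGTCA", "EcoRI")
print(sense, round(gc_fraction(sense), 2))
print(antisense, round(gc_fraction(antisense), 2))
print(three_prime_ends_pair(sense, antisense))
```

Comparing `gc_fraction` across a primer pair and rejecting pairs for which `three_prime_ends_pair` is true mirrors two of the design considerations stated earlier, namely similar GC composition and no complementarity between 3' ends.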

The mRNA prepared from HEL cells as described above
was reverse-transcribed into cDNA in the presence of 50 mM
Tris-HCl, pH 8.3, 8 mM MgCl2, 30 mM KCl, 10 mM
dithiothreitol, 2 mM nucleotide triphosphates, excess oligo
(dT) primers and 34 units of AMV reverse transcriptase at
42°C for 2 hours in 40 µl of reaction volume.
Amplification by PCR was carried out with a 7.5% aliquot (3
µl) of the reverse-transcribed mRNA, in the presence of 10
mM Tris-HCl, pH 8.3, 50 mM KCl, 1.5 mM MgCl2, 0.01% gelatin,
0.2 mM nucleotide triphosphates, 1 µM of both sense and
antisense primers and 2.5 units of Taq polymerase (Perkin
Elmer Cetus) in 100 µl reaction volume. Amplifications
were performed on a thermal cycler (Perkin Elmer Cetus)
using the following program: first 5 thermal cycles with
denaturation for 1 minute at 94°C, annealing for 1 minute
at 50°C, a 2 minute ramp to 55°C and elongation for 1 minute
at 72°C, followed by 20 cycles of 1 minute at 94°C, 30
seconds at 55°C and 1 minute at 72°C. A second round of PCR
was performed with 2 µl of the first reaction as a
template. This involved 25 thermal cycles, each composed
of 94°C (1 min), 55°C (0.5 min), 72°C (1 min).
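The cycling program and the aliquot arithmetic above can be written out as data, which makes the totals easy to check. This is a sketch only: the tuple layout and function names are assumptions for illustration, and the computed time counts only the programmed holds and the stated 2 minute ramp, not the instrument's own transition times between temperatures.

```python
# Each step is (label, temperature in deg C or None for a ramp, minutes).
STANDARD_CYCLE = [
    ("denaturation", 94, 1.0),
    ("annealing", 55, 0.5),
    ("elongation", 72, 1.0),
]

# A program is a list of (number of cycles, steps per cycle).
FIRST_ROUND = [
    (5, [
        ("denaturation", 94, 1.0),
        ("annealing", 50, 1.0),
        ("ramp to 55", None, 2.0),
        ("elongation", 72, 1.0),
    ]),
    (20, STANDARD_CYCLE),
]
SECOND_ROUND = [(25, STANDARD_CYCLE)]


def total_cycles(program):
    """Total number of thermal cycles in a program."""
    return sum(cycles for cycles, _ in program)


def programmed_minutes(program):
    """Sum of programmed step durations over all cycles."""
    return sum(
        cycles * sum(minutes for _, _, minutes in steps)
        for cycles, steps in program
    )


def aliquot_percent(aliquot_ul, total_ul):
    """Aliquot volume as a percentage of the total reaction volume."""
    return 100.0 * aliquot_ul / total_ul


print(total_cycles(FIRST_ROUND), programmed_minutes(FIRST_ROUND))
print(total_cycles(SECOND_ROUND), programmed_minutes(SECOND_ROUND))
print(aliquot_percent(3, 40))  # 3 µl taken from the 40 µl RT reaction
```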

General procedures such as purification of nucleic
acids, restriction enzyme digestion, gel electrophoresis,
transfer of nucleic acid to solid supports and subcloning
were performed essentially according to established
procedures as described by Sambrook et al (1989)
Molecular Cloning: A Laboratory Manual, 2nd Ed., Cold Spring
Harbor Laboratory (Cold Spring Harbor, New York, USA).

Samples of the PCR products were digested with BamHI
and EcoRI and subsequently fractionated by low melting
point agarose gel electrophoresis. Bands corresponding to
the approximate expected sizes (see Table 1: ≈460 bp for
primer pair B3-S and E8-AS and ≈140 bp for primer pair B7-
S and E8-AS) were excised from the gel and the DNA was
purified. Subsequently, these fragments were ligated into
pUC19 (Yanisch-Perron et al (1985) Gene 33, 103-119), which
had been previously linearised with BamHI and EcoRI, and
transformed into E. coli strain DH5α using standard
protocols (Sambrook et al, supra). Individual clones were
sequenced using standard double-stranded sequencing
techniques and the dideoxynucleotide chain termination
method as described by Sanger et al (1977) Proc. Natl.
Acad. Sci. USA 74, 5463-5467, and T7 DNA polymerase.

Employing Reverse Transcriptase PCR on HEL mRNA with
the primer pair B3-S and E8-AS, three PCR products were
obtained, termed 11.1, 11.2 and 11.3, that corresponded to
novel genes. Using the primer pair B7-S and E8-AS, an
additional novel PCR product was obtained, termed 5.2.

Isolation of cDNA Clones

The PCR products obtained were used to screen various
cDNA libraries described supra. Labelling of the inserts
of PCR products was performed using the random priming method
(Feinberg and Vogelstein (1983) Anal. Biochem. 132, 6-13)
using the Megaprime DNA labelling system (Amersham). The
oligonucleotide derived from the sequence of the PCR
product 5.2 was labelled by phosphorylation with T4
polynucleotide kinase following standard protocols
(Sambrook et al, supra). Hybridization and purification of
positive bacteriophages were performed using standard
molecular biological techniques.

The double-stranded DNA clones were all sequenced
using the dideoxynucleotide chain-termination method as
described by Sanger et al, supra, using T7 DNA polymerase
(Pharmacia-LKB) or Sequenase (U.S. Biochemical
Corporation, Cleveland, Ohio, U.S.A.). Compressions of
nucleotides were resolved using 7-deaza-GTP (U.S.
Biochemical Corp.). DNA sequences were analyzed using the
DNA STAR computer program (DNA STAR Ltd. U.K.). Analyses
of the sequences obtained revealed the existence of six
distinct putative receptor serine/threonine kinases which
have been named ALK 1-6.

To clone cDNA for ALK-1 the oligo (dT) primed human
placenta cDNA library was screened with a radiolabeled
insert derived from the PCR product 11.3; based upon their
restriction enzyme digestion patterns, three different
types of clones with approximate insert sizes of 1.7 kb,
2 kb & 3.5 kb were identified. The 2 kb clone, named
HP57, was chosen as representative of this class and
subjected to complete sequencing. Sequence analysis of
ALK-1 revealed a sequence of 1984 nucleotides including a
poly-A tail (SEQ ID No. 1). The longest open reading frame
encodes a protein of 503 amino-acids, with high sequence
similarity to receptor serine/threonine kinases (see
below). The first methionine codon, the putative
translation start site, is at nucleotide 283-285 and is
preceded by an in-frame stop codon. This first ATG is in
a more favourable context for translation initiation (Kozak
(1987) Nucl. Acids Res. 15, 8125-8148) than the second and
third in-frame ATG at nucleotides 316-318 and 325-327. The
putative initiation codon is preceded by a 5' untranslated
sequence of 282 nucleotides that is GC-rich (80% GC), which
is not uncommon for growth factor receptors (Kozak (1991)
J. Cell Biol. 115, 887-903). The 3' untranslated sequence
comprises 193 nucleotides and ends with a poly-A tail. No
bona fide poly-A addition signal is found, but there is a
sequence (AATACA), 17-22 nucleotides upstream of the poly-A
tail, which may serve as a poly-A addition signal.
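The kind of open reading frame analysis described above (longest reading frame, first ATG, in-frame stop codon upstream) can be sketched in a few lines of Python. This is an illustrative simplification and not the procedure used with the DNA STAR program: it scans only the forward strand, ignores the Kozak context, and the 21-nucleotide test sequence is a made-up example rather than any ALK sequence.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(seq):
    """Return (start, end) of the longest ATG-to-stop ORF on the forward strand.

    Coordinates are 0-based with `end` exclusive and including the stop codon.
    In each frame the ORF begins at the first ATG after the previous stop.
    Returns None if no complete ORF is present.
    """
    seq = seq.upper()
    best = None
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if best is None or (i + 3 - start) > (best[1] - best[0]):
                    best = (start, i + 3)
                start = None
    return best


def has_upstream_inframe_stop(seq, start):
    """True if a stop codon lies upstream of `start` in the same frame."""
    seq = seq.upper()
    return any(seq[i:i + 3] in STOP_CODONS for i in range(start % 3, start, 3))


# Hypothetical sequence: C TAG GCC ATG GCT GCT TAA GG
example = "CTAGGCCATGGCTGCTTAAGG"
orf = longest_orf(example)
print(orf)
print("ATG at nucleotides %d-%d (1-based)" % (orf[0] + 1, orf[0] + 3))
print(has_upstream_inframe_stop(example, orf[0]))
```

An in-frame stop codon upstream of the first ATG, as reported for several of the clones, indicates that the reading frame cannot extend further in the 5' direction, which supports the assignment of that ATG as the start site.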

ALK-2 cDNA was cloned by screening an amplified oligo
(dT) primed human placenta cDNA library with a
radiolabeled insert derived from the PCR product 11.2.
Two clones, termed HP53 and HP64, with insert sizes of 2.7
kb and 2.4 kb respectively, were identified and their
sequences were determined. No sequence difference in the
overlapping clones was found, suggesting they are both
derived from transcripts of the same gene.

Sequence analysis of cDNA clone HP53 (SEQ ID No. 3)
revealed a sequence of 2719 nucleotides with a poly-A tail.
The longest open reading frame encodes a protein of 509
amino-acids. The first ATG at nucleotides 104-106 agrees
favourably with Kozak's consensus sequence with an A at
position 3. This ATG is preceded in-frame by a stop codon.
There are four ATG codons in close proximity further
downstream, which agree with Kozak's consensus sequence
(Kozak, supra), but according to Kozak's scanning model the
first ATG is predicted to be the translation start site.
The 5' untranslated sequence is 103 nucleotides. The 3'
untranslated sequence of 1089 nucleotides contains a
polyadenylation signal located 9-14 nucleotides upstream
from the poly-A tail. The cDNA clone HP64 lacks 498
nucleotides from the 5' end compared to HP53, but the
sequence is extended at the 3' end by 190 nucleotides and
the poly-A tail is absent. This suggests that different
polyadenylation sites occur for ALK-2. In Northern blots,
however, only one transcript was detected (see below).

The cDNA for human ALK-3 was cloned by initially
screening an oligo (dT) primed human foreskin fibroblast
cDNA library with an oligonucleotide (SEQ ID No. 23)
derived from the PCR product 5.2. One positive cDNA clone
with an insert size of 3 kb, termed ON11, was identified.
However, upon partial sequencing, it appeared that this
clone was incomplete; it encodes only part of the kinase
domain and lacks the extracellular domain. The most 5'
sequence of ON11, a 540 nucleotide Xfe&I restriction
fragment encoding a truncated kinase domain, was
subsequently used to probe a random primed fibroblast cDNA
library from which one cDNA clone with an insert size of 3
kb, termed ONF5, was isolated (SEQ ID No. 5). Sequence
analysis of ONF5 revealed a sequence of 2932 nucleotides
without a poly-A tail, suggesting that this clone was
derived by internal priming. The longest open reading
frame codes for a protein of 532 amino-acids. The first
ATG codon which is compatible with Kozak's consensus
sequence (Kozak, supra) is at nucleotides 310-312 and is
preceded by an in-frame stop codon. The 5' and 3'
untranslated sequences are 309 and 1027 nucleotides long,
respectively.

ALK-4 cDNA was identified by screening an oligo
(dT) primed human erythroleukemia cDNA library with the
radiolabeled insert of the PCR product 11.1 as a probe.
One cDNA clone, termed 11H8, was identified with an insert
size of 2 kb (SEQ ID No. 7). An open reading frame was
found encoding a protein sequence of 383 amino-acids,
including a truncated extracellular domain, with high
similarity to receptor serine/threonine kinases. The 3'
untranslated sequence is 818 nucleotides and does not
contain a poly-A tail, suggesting that the cDNA was
internally primed. cDNA encoding the complete
extracellular domain (nucleotides 1-366) was obtained from
HEL cells by RT-PCR with a 5' primer (SEQ ID No. 24) derived
in part from sequence at the translation start site of SKR-2 (a
cDNA sequence deposited in the GenBank data base, accession
number L10125, that is identical in part to ALK-4) and a 3'
primer (SEQ ID No. 25) derived from the 11H8 cDNA clone.

ALK-5 was identified by screening the random primed
HEL cell λgt10 cDNA library with the PCR product 11.1 as
a probe. This yielded one positive clone termed EMBLA
(insert size of 5.3 kb with 2 internal EcoRI sites).
Nucleotide sequencing revealed an open reading frame of
1509 bp, coding for 503 amino-acids. The open reading
frame was flanked by a 5' untranslated sequence of 76 bp,
and a 3' untranslated sequence of 3.7 kb which was not
completely sequenced. The nucleotide and deduced amino-
acid sequences of ALK-5 are shown in SEQ ID Nos. 9 and 10.
In the 5' part of the open reading frame, only one ATG
codon was found; this codon fulfils the rules of
translation initiation (Kozak, supra). An in-frame stop
codon was found at nucleotides (-54)-(-52) in the 5'
untranslated region. The predicted ATG start codon is
followed by a stretch of hydrophobic amino-acid residues
which has characteristics of a cleavable signal sequence.
Therefore, the first ATG codon is likely to be used as a
translation initiation site. A preferred cleavage site for
the signal peptidase, according to von Heijne (1986) Nucl.
Acids Res. 14, 4683-4690, is located between amino-acid
residues 24 and 25. The calculated molecular mass of the
primary translated product of ALK-5 without signal
sequence is 53,646 Da.
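A calculated molecular mass of this kind is conventionally obtained by summing residue masses over the mature sequence and adding one water for the free termini. The sketch below shows that calculation using standard average residue masses; the specification does not state which mass table or program was used, so this is an assumed method, and the short peptide in the example is hypothetical. For ALK-5 the signal length would be 24, given the cleavage site between residues 24 and 25.

```python
# Average residue masses in Da (amino acid minus one water).
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167,
    "V": 99.1326, "T": 101.1051, "C": 103.1388, "L": 113.1594,
    "I": 113.1594, "N": 114.1038, "D": 115.0886, "Q": 128.1307,
    "K": 128.1741, "E": 129.1155, "M": 131.1926, "H": 137.1411,
    "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528


def protein_mass(sequence):
    """Average molecular mass in Da of an unmodified polypeptide."""
    residues = sequence.upper()
    return sum(AVERAGE_RESIDUE_MASS[aa] for aa in residues) + WATER_MASS


def mature_mass(sequence, signal_length):
    """Mass after removal of an N-terminal signal sequence of given length."""
    return protein_mass(sequence[signal_length:])


# Hypothetical precursor: a 2-residue "signal" followed by Gly-Gly.
print(round(protein_mass("MAGG"), 2))
print(round(mature_mass("MAGG", 2), 2))
```

Post-translational modifications such as glycosylation are not considered, which is consistent with the text referring to the primary translated product.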

Screening of the mouse embryo λEXlox cDNA library
using PCR product 11.1 as a probe yielded 20 positive
clones. DNAs from the positive clones obtained from this
library were digested with EcoRI and HindIII,
electrophoretically separated on a 1.3% agarose gel and
transferred to nitrocellulose filters according to
established procedures as described by Sambrook et al,
supra. The filters were then hybridized with specific
probes for human ALK-1 (nucleotide 288-670), ALK-2
(nucleotide 1-581), ALK-3 (nucleotide 79-824) or ALK-4
(nucleotide 1178-1967). Such analyses revealed that a clone
termed ME-7 hybridised with the human ALK-3 probe.
However, nucleotide sequencing revealed that this clone was
incomplete, and lacked the 5' part of the translated
region. Screening the same cDNA library with a probe
corresponding to the extracellular domain of human ALK-3
(nucleotides 79-824) revealed the clone ME-D. This clone
was isolated and the sequence was analyzed. Although this
clone was incomplete in the 3' end of the translated
region, ME-7 and ME-D overlapped and together covered the
complete sequence of mouse ALK-3. The predicted amino-acid
sequence of mouse ALK-3 is very similar to the human
sequence; only 8 amino-acid residues differ (98% identity;
see SEQ ID No. 14) and the calculated molecular mass of the
primary translated product without the putative signal
sequence is 57,447 Da.
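The identity figure quoted above follows directly from the number of differing residues. The sketch below shows the arithmetic and a simple position-by-position comparison of two pre-aligned sequences. It assumes an aligned length of 532 residues, taken from the length of the human ALK-3 open reading frame stated earlier; the short test sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over aligned positions, ignoring gap columns ('-')."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    compared = [
        (x, y) for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x != "-" and y != "-"
    ]
    matches = sum(x == y for x, y in compared)
    return 100.0 * matches / len(compared)


def identity_from_mismatches(length, mismatches):
    """Percent identity when only the number of differing residues is known."""
    return 100.0 * (length - mismatches) / length


# 8 differing residues over an assumed aligned length of 532 amino-acids.
print(round(identity_from_mismatches(532, 8), 1))
# Hypothetical 4-residue alignment with one mismatch.
print(percent_identity("ACDE", "ACDF"))
```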

Of the clones obtained from the initial library
screening with PCR product 11.1, four clones hybridized to
the probe corresponding to the conserved kinase domain of
ALK-4 but not to probes from more divergent parts of ALK-1
to -4. Analysis of these clones revealed that they have an
identical sequence, which differs from those of ALK-1 to -5
and was termed ALK-6. The longest clone, ME6, with a 2.0 kb
insert, was completely sequenced, yielding a 1952 bp fragment
consisting of an open reading frame of 1506 bp (502 amino-
acids), flanked by a 5' untranslated sequence of 186 bp
and a 3' untranslated sequence of 160 bp. The nucleotide
and predicted amino-acid sequences of mouse ALK-6 are shown
in SEQ ID Nos. 17 and 18. No polyadenylation signal was
found in the 3' untranslated region of ME6, indicating that
the cDNA was internally primed in the 3' end. Only one ATG
codon was found in the 5' part of the open reading frame,
which fulfils the rules for translation initiation (Kozak,
supra), and was preceded by an in-frame stop codon at
nucleotides 163-165. However, a typical hydrophobic leader
sequence was not observed at the N terminus of the
translated region. Since there is no other ATG codon or
putative hydrophobic leader sequence, this ATG codon is
likely to be used as a translation initiation site. The
calculated molecular mass of the primary translated product
with the putative signal sequence is 55,576 Da.
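
The reasoning used above to assign an initiation codon (a first ATG that
is preceded by an in-frame stop codon and followed by an open reading
frame) can be sketched in Python as follows. This is a minimal
illustration under stated assumptions: the function names are
hypothetical, indices are 0-based, and no test of the Kozak context is
attempted. If nucleotides are numbered from 1, a 5' untranslated sequence
of 186 bp places the ATG at nucleotide 187, which lies 24 nucleotides (a
multiple of three) downstream of a stop codon at 163-165, and an open
reading frame of 1506 bp corresponds to 1506 / 3 = 502 codons.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_atg(cdna: str) -> int:
    """Return the 0-based index of the first ATG, or -1 if there is none."""
    return cdna.upper().find("ATG")


def upstream_in_frame_stop(cdna: str, atg_index: int):
    """Return the index of the nearest in-frame stop codon 5' of the ATG.

    Walks upstream in steps of three from the ATG; returns None if no
    in-frame stop codon is present in the 5' sequence.
    """
    cdna = cdna.upper()
    for i in range(atg_index - 3, -1, -3):
        if cdna[i:i + 3] in STOP_CODONS:
            return i
    return None


def orf_length(cdna: str, atg_index: int):
    """Return the ORF length in nucleotides, excluding the stop codon.

    Returns None if no in-frame stop codon follows the ATG.
    """
    cdna = cdna.upper()
    for i in range(atg_index, len(cdna) - 2, 3):
        if cdna[i:i + 3] in STOP_CODONS:
            return i - atg_index
    return None
```

For example, on the toy sequence `CCTAGGCCATGAAACCCTAAGG` the first ATG is
found at index 8, an in-frame TAG lies upstream at index 2, and the open
reading frame is 9 nucleotides (3 codons) long.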

Mouse ALK-1 (clone AM6 with 1.9 kb insert) was
obtained from the mouse placenta λZAPII cDNA library using
human ALK-1 cDNA as a probe (see SEQ ID No. 11). Mouse
ALK-4 (clone 8a1 with 2.3 kb insert) was also obtained from
this library using human ALK-4 cDNA as a probe (SEQ
ID No. 15).

To summarise, clones HP22, HP57, ONF1, ONF3, ONF4 and
HP29 encode the same gene, ALK-1. Clone AM6 encodes mouse
ALK-1. HP53, HP64 and HP84 encode the same gene, ALK-2.
ONF5, ONF2 and ON11 encode the same gene, ALK-3. ME-7 and
ME-D encode the mouse counterpart of human ALK-3. 11H8
encodes a different gene, ALK-4, whilst 8a1 encodes the
mouse equivalent. EMBLA encodes ALK-5, and ME-6 encodes
ALK-6.

The sequence alignment between the 6 ALK genes and
TβR-II, mActR-II and ActR-IIB is shown in Figure 3. These
molecules have a similar domain structure: an N-terminal
predicted hydrophobic signal sequence (von Heijne (1986)
Nucl. Acids Res. 14, 4683-4690) is followed by a relatively
small extracellular cysteine-rich ligand-binding domain, a
single hydrophobic transmembrane region (Kyte & Doolittle
(1982) J. Mol. Biol. 157, 105-132) and a C-terminal
intracellular portion, which consists almost entirely of a
kinase domain (Figures 3 and 4).

The extracellular domains of these receptors have
cysteine-rich regions, but they show little sequence
similarity; for example, less than 20% sequence identity is
found between daf-1, ActR-II, TβR-II and ALK-5. The ALKs
appear to form a subfamily as they show higher sequence
similarities (15-47% identity) in their extracellular
domains. The extracellular domains of ALK-5 and ALK-4 have
about 29% sequence identity. In addition, ALK-3 and ALK-6
share a high degree of sequence similarity in their
extracellular domains (46% identity).
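
Percentage identities such as those quoted above are derived from
pairwise sequence alignments. The following Python sketch shows one
simple way of computing such a figure from two sequences that have
already been aligned. It is an illustration only: the function name is
hypothetical, and the choice to exclude gapped columns from the
denominator is an assumption, since the specification does not state
which convention was used.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Return the percentage of identical residues over ungapped columns.

    Both inputs must be rows of the same alignment (equal length, with
    `gap` marking insertions or deletions).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip columns in which either sequence has a gap
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared
```

Other conventions (for example dividing by the full alignment length or
by the length of the shorter sequence) give somewhat different values, so
figures computed in this way are comparable only when the same
convention is used throughout.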

The positions of many of the cysteine residues in all
receptors can be aligned, suggesting that the extracellular
domains may adopt a similar structural configuration. See
Figure 5 for ALKs -1, -2, -3 & -5. Each of the ALKs (except
ALK-6) has a potential N-linked glycosylation site, the
position of which is conserved between ALK-1 and ALK-2, and
between ALK-3, ALK-4 and ALK-5 (see Figure 4).

The sequence similarities in the kinase domains
between ActR-II, TβR-II and ALK-5 are approximately
40%, whereas the sequence similarity between the ALKs 1 to
6 is higher (between 59% and 90%; see Figure 6). Pairwise
comparison using the Jotun-Hein sequence alignment program
(Hein (1990) Meth. Enzymol. 183, 626-645), between all
family members, identifies the ALKs as a separate subclass
among serine/threonine kinases (Figure 7).

The catalytic domains of kinases can be divided into
12 subdomains with stretches of conserved amino-acid
residues. The key motifs are found in serine/threonine
kinase receptors, suggesting that they are functional
kinases. The consensus sequence for the binding of ATP
(Gly-X-Gly-X-X-Gly in subdomain I followed by a Lys residue
further downstream in subdomain II) is found in all the
ALKs.
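
The ATP-binding consensus described above lends itself to a simple
pattern search. The following Python sketch, given for illustration only,
locates a Gly-X-Gly-X-X-Gly motif followed by a downstream Lys residue in
a one-letter amino-acid sequence. The function name and the `max_gap`
window are assumptions made for this example; the text specifies only
that the Lys lies "further downstream".

```python
import re

# Gly-X-Gly-X-X-Gly, where X is any residue (one-letter code).
GLYCINE_LOOP = re.compile(r"G.G..G")


def find_atp_binding_motif(kinase_domain: str, max_gap: int = 30):
    """Return (loop_start, lysine_index) for the first motif found, else None.

    `max_gap` (an assumed value) limits how far downstream of the glycine
    loop the Lys residue is searched for. Indices are 0-based.
    """
    sequence = kinase_domain.upper()
    for match in GLYCINE_LOOP.finditer(sequence):
        window = sequence[match.end():match.end() + max_gap]
        offset = window.find("K")
        if offset != -1:
            return match.start(), match.end() + offset
    return None
```

A match of this kind shows only that the consensus residues are present;
it does not by itself demonstrate that the protein is an active kinase.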

The kinase domains of daf-1, ActR-II, and ALKs show
approximately equal sequence similarity with tyrosine and
serine/threonine protein kinases. However, analysis of the
amino-acid sequences in subdomains VI and VIII, which are
the most useful to distinguish a specificity for
phosphorylation of tyrosine residues versus
serine/threonine residues (Hanks et al (1988) Science 241,
42-52), indicates that these kinases are serine/threonine
kinases; refer to Table 2.

TABLE 2

KINASE                               SUBDOMAINS
                                     VIB       VIII

Serine/threonine kinase consensus    DLKPEN    G(T/S)XX(Y/F)X
Tyrosine kinase consensus            DLAARN    XP(I/V)(K/R)W(T/K)
ActR-II                              DIKSKN    GTRRYM
ActR-IIB                             DFKSKN    GTRRYM
TβR-II                               DLKSSN    GTARYM
ALK-1                                DFKSRN    GTKRYM
ALK-2, -3, -4, -5 & -6               DLKSKN    GTKRYM

The sequence motifs DLKSKN (Subdomain VIB) and GTKRYM
(Subdomain VIII), which are found in most of the
serine/threonine kinase receptors, agree well with the
consensus sequences for all protein serine/threonine kinase
receptors in these regions. In addition, these receptors,
except for ALK-1, do not have a tyrosine residue surrounded
by acidic residues between subdomains VII and VIII, which
is common for tyrosine kinases. A unique characteristic of
the members of the ALK serine/threonine kinase receptor
family is the presence of two short inserts in the kinase
domain, between subdomains VIA and VIB and between
subdomains X and XI. In the intracellular domain, these
regions, together with the juxtamembrane part and C-terminal
tail, are the most divergent between family
members (see Figures 3 and 4). Based on the sequence
similarity with the type II receptors for TGF-β and
activin, the C termini of the kinase domains of ALKs -1 to
-6 are set at Ser-495, Ser-501, Ser-527, Gln-500, Gln-498
and Ser-497, respectively.
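
The comparison of subdomain VIII motifs with the two consensus sequences
of Table 2 can be expressed as a pattern match, as in the following
Python sketch. It is an illustration only: the regular expressions are
transcribed from the consensus rows of Table 2 (with X taken to mean any
residue), and the function and dictionary names are hypothetical.

```python
import re

# Subdomain VIII consensus patterns transcribed from Table 2.
SER_THR_VIII = re.compile(r"G[TS]..[YF].")   # G(T/S)XX(Y/F)X
TYR_VIII = re.compile(r".P[IV][KR]W[TK]")    # XP(I/V)(K/R)W(T/K)

# Subdomain VIII motifs of the receptors as listed in Table 2.
TABLE_2_VIII = {
    "ActR-II": "GTRRYM",
    "ActR-IIB": "GTRRYM",
    "TbR-II": "GTARYM",
    "ALK-1": "GTKRYM",
    "ALK-2 to ALK-6": "GTKRYM",
}


def classify_subdomain_viii(motif: str) -> str:
    """Classify a six-residue subdomain VIII motif against both consensuses."""
    motif = motif.upper()
    if SER_THR_VIII.fullmatch(motif):
        return "serine/threonine"
    if TYR_VIII.fullmatch(motif):
        return "tyrosine"
    return "unclassified"


if __name__ == "__main__":
    for receptor, motif in TABLE_2_VIII.items():
        print(receptor, motif, classify_subdomain_viii(motif))
```

Every receptor motif listed in Table 2 matches the serine/threonine
pattern and none matches the tyrosine pattern, in line with the
conclusion drawn in the text.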
mRNA Expression

The distribution of ALK-1, -2, -3 and -4 was determined
by Northern blot analysis. A Northern blot filter with
mRNAs from different human tissues was obtained from
Clontech (Palo Alto, CA). The filters were hybridized
with 32P-labelled probes at 42°C overnight in 50%
formaldehyde, 5 x standard saline citrate (SSC; 1 x SSC is
50 mM sodium citrate, pH 7.0, 150 mM NaCl), 0.1% SDS, 50 mM
sodium phosphate, 5 x Denhardt's solution and 0.1 mg/ml
salmon sperm DNA. In order to minimize cross-hybridization,
probes were used that did not encode part of
the kinase domains, but corresponded to the highly diverged
sequences of either 5' untranslated and ligand-binding
regions (probes for ALK-1, -2 and -3) or 3' untranslated
sequences (probe for ALK-4). The probes were labelled by
random priming using the Multiprime (or Mega-prime) DNA
labelling system and [α-32P]dCTP (Feinberg & Vogelstein
(1983) Anal. Biochem. 132, 6-13). Unincorporated label was
removed by Sephadex G-25 chromatography. Filters were
washed at 65°C, twice for 30 minutes in 2.5 x SSC, 0.1% SDS
and twice for 30 minutes in 0.3 x SSC, 0.1% SDS before
being exposed to X-ray film. Stripping of blots was
performed by incubation at 90-100°C in water for 20
minutes.

The ALK-5 mRNA size and distribution were determined
by Northern blot analysis as above. An EcoRI fragment of
980 bp of the full length ALK-5 cDNA clone, corresponding to
the C-terminal part of the kinase domain and 3'
untranslated region (nucleotides 1259-2232 in SEQ ID No. 9),
was used as a probe. The filter was washed twice in 0.5 x
SSC, 0.1% SDS at 55°C for 15 minutes.

WO 94/11502                                   PCT/GB93/02367

Using the probe for ALK-1, two transcripts of 2.2 and
4.9 kb were detected. The ALK-1 expression level varied
strongly between different tissues: high in placenta and
lung, moderate in heart, muscle and kidney, and low (to not
detectable) in brain, liver and pancreas. The relative
ratios between the two transcripts were similar in most
tissues; in kidney, however, there was relatively more of
the 4.9 kb transcript. By reprobing the blot with a probe
for ALK-2, one transcript of 4.0 kb was detected with a
ubiquitous expression pattern. Expression was detected in
every tissue investigated and was highest in placenta and
skeletal muscle. Subsequently the blot was reprobed for
ALK-3. One major transcript of 4.4 kb and a minor
transcript of 7.9 kb were detected. Expression was high in
skeletal muscle, in which an additional minor
transcript of 10 kb was also observed. Moderate levels of ALK-3
mRNA were detected in heart, placenta, kidney and pancreas,
and low (to not detectable) expression was found in brain,
lung and liver. The relative ratios between the different
transcripts were similar in the tested tissues, the 4.4 kb
transcript being the predominant one, with the exception
of brain, where both transcripts were expressed at a
similar level. Probing the blot with ALK-4 indicated the
presence of a transcript with an estimated size of 5.2 kb
and revealed a ubiquitous expression pattern. The results
of Northern blot analysis using the probe for ALK-5 showed
that a 5.5 kb transcript is expressed in all human tissues
tested, being most abundant in placenta and least abundant
in brain and heart.

The distribution of mRNA for mouse ALK-3 and -6 in
various mouse tissues was also determined by Northern blot
analysis. A multiple mouse tissue blot was obtained from
Clontech, Palo Alto, California, U.S.A. The filter was
hybridized as described above with probes for mouse ALK-3
and ALK-6. The EcoRI-PstI restriction fragment,
corresponding to nucleotides 79-1100 of ALK-3, and the
SacI-HpaI fragment, corresponding to nucleotides 57-720 of
ALK-6, were used as probes. The filter was washed at 65°C
twice for 30 minutes in 2.5 x SSC, 0.1% SDS and twice for
30 minutes with 0.3 x SSC, 0.1% SDS and then subjected to
autoradiography.

Using the probe for mouse ALK-3, a 1.1 kb transcript
was found only in spleen. By reprobing the blot with the
ALK-6 specific probe, a transcript of 7.2 kb was found in
brain and a weak signal was also seen in lung. No other
signal was seen in the other tissues tested, i.e. heart,
liver, skeletal muscle, kidney and testis.

All detected transcript sizes were different, and thus
no cross-reaction between mRNAs for the different ALKs was
observed when the specific probes were used. This suggests
that the multiple transcripts of ALK-1 and ALK-3 are coded
from the same gene. The mechanism for generation of the
different transcripts is unknown at present; they may be
formed by alternative mRNA splicing, differential
polyadenylation, use of different promoters, or by a
combination of these events. Differences in mRNA splicing
in the regions coding for the extracellular domains may
lead to the synthesis of receptors with different
affinities for ligands, as was shown for mActR-IIB
(Attisano et al (1992) Cell 68, 97-108), or to the
production of soluble binding protein.

The above experiments describe the isolation of
nucleic acid sequences coding for a new family of human
receptor kinases. The cDNA for ALK-5 was then used to
determine the encoded protein size and binding properties.

Properties of the ALKs cDNA Encoded Proteins

To study the properties of the proteins encoded by the
different ALK cDNAs, the cDNA for each ALK was subcloned
into a eukaryotic expression vector and transfected into
various cell types and then subjected to
immunoprecipitation using a rabbit antiserum raised against
a synthetic peptide corresponding to part of the
intracellular juxtamembrane region. This region is
divergent in sequence between the various serine/threonine
kinase receptors. The following amino-acid residues were
used:

ALK-1 145-166
ALK-2 151-172
ALK-3 181-202
ALK-4 153-171
ALK-5 158-179
ALK-6 151-168
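
For illustration, the residue ranges listed above can be used to excise
the corresponding peptide from a full-length amino-acid sequence, as in
the following Python sketch. The dictionary simply restates the list; the
function names are hypothetical, and residue numbering is assumed to be
1-based and inclusive.

```python
# Residue ranges (1-based, inclusive) of the juxtamembrane peptides.
PEPTIDE_RANGES = {
    "ALK-1": (145, 166),
    "ALK-2": (151, 172),
    "ALK-3": (181, 202),
    "ALK-4": (153, 171),
    "ALK-5": (158, 179),
    "ALK-6": (151, 168),
}


def peptide_length(start: int, end: int) -> int:
    """Number of residues in an inclusive 1-based range."""
    return end - start + 1


def extract_peptide(protein: str, start: int, end: int) -> str:
    """Return residues start..end (1-based, inclusive) of `protein`."""
    if not 1 <= start <= end <= len(protein):
        raise ValueError("residue range lies outside the protein sequence")
    return protein[start - 1:end]
```

With these ranges the peptides are 22 residues long for ALK-1, -2, -3 and
-5, 19 residues for ALK-4 and 18 residues for ALK-6.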

The rabbit antiserum against ALK-5 was designated VPN.

The peptides were synthesized with an Applied
Biosystems 430A Peptide Synthesizer using t-butoxycarbonyl
chemistry and purified by reversed-phase high performance
liquid chromatography. The peptides were coupled to
keyhole limpet haemocyanin (Calbiochem-Behring) using
glutaraldehyde, as described by Gullick et al (1985) EMBO
J. 4, 2869-2877. The coupled peptides were mixed with
Freund's adjuvant and used to immunize rabbits.
Transient transfection of the ALK-5 cDNA 

COS-1 cells (American Type Culture Collection) and the
R mutant of Mv1Lu cells (for references, see below) were
cultured in Dulbecco's modified Eagle's medium containing
10% fetal bovine serum (FBS), 100 units/ml penicillin
and 50 µg/ml streptomycin in a 5% CO2 atmosphere at 37°C.
The ALK-5 cDNA (nucleotides (-76) - 2232), which includes
the complete coding region, was cloned in the pSV7d vector
(Truett et al (1985) DNA 4, 333-349), and used for
transfection. Transfection into COS-1 cells was performed
by the calcium phosphate precipitation method (Wigler et al
(1979) Cell 16, 777-785). Briefly, cells were seeded into
6-well cell culture plates at a density of 5 x 10^5
cells/well, and transfected the following day with 10 µg of
recombinant plasmid. After overnight incubation, cells
were washed three times with a buffer containing 25 mM
Tris-HCl, pH 7.4, 138 mM NaCl, 5 mM KCl, 0.7 mM CaCl2, 0.5
mM MgCl2 and 0.6 mM Na2HPO4, and then incubated with
Dulbecco's modified Eagle's medium containing FBS and
antibiotics. Two days after transfection, the cells were
metabolically labelled by incubating the cells for 6 hours
in methionine- and cysteine-free MCDB 104 medium with 150
µCi/ml of [35S]-methionine and [35S]-cysteine (in vivo
labelling mix; Amersham). After labelling, the cells were
washed with 150 mM NaCl, 25 mM Tris-HCl, pH 7.4, and then
solubilized with a buffer containing 20 mM Tris-HCl, pH 7.4,
150 mM NaCl, 10 mM EDTA, 1% Triton X-100, 1% deoxycholate,
1.5% Trasylol (Bayer) and 1 mM phenylmethylsulfonylfluoride
(PMSF; Sigma). After 15 minutes on ice, the cell lysates
were pelleted by centrifugation, and the supernatants were
then incubated with 7 µl of preimmune serum for 1.5 hours
at 4°C. Samples were then given 50 µl of protein A-Sepharose
(Pharmacia-LKB) slurry (50% packed beads in 150
mM NaCl, 20 mM Tris-HCl, pH 7.4, 0.2% Triton X-100) and
incubated for 45 minutes at 4°C. The beads were spun down
by centrifugation, and the supernatants (1 ml) were then
incubated with either 7 µl of preimmune serum or the VPN
antiserum for 1.5 hours at 4°C. For blocking, 10 µg of
peptide was added together with the antiserum. Immune
complexes were then given 50 µl of protein A-Sepharose
(Pharmacia-LKB) slurry (50% packed beads in 150 mM NaCl,
20 mM Tris-HCl, pH 7.4, 0.2% Triton X-100) and incubated for
45 minutes at 4°C. The beads were spun down and washed
four times with a washing buffer (20 mM Tris-HCl, pH 7.4,
500 mM NaCl, 1% Triton X-100, 1% deoxycholate and 0.2%
SDS), followed by one wash in distilled water. The immune
complexes were eluted by boiling for 5 minutes in the SDS-sample
buffer (100 mM Tris-HCl, pH 8.8, 0.01% bromophenol
blue, 36% glycerol, 4% SDS) in the presence of 10 mM DTT,
and analyzed by SDS-gel electrophoresis using 7-15%
polyacrylamide gels (Blobel and Dobberstein (1975) J. Cell
Biol. 67, 835-851). Gels were fixed, incubated with
Amplify (Amersham) for 20 minutes, and subjected to
fluorography. A component of 53 kDa was seen. This
component was not seen when preimmune serum was used, or
when 10 µg blocking peptide was added together with the
antiserum. Moreover, it was not detectable in samples
derived from untransfected COS-1 cells using either
preimmune serum or the antiserum.

Digestion with Endoglycosidase F

Samples immunoprecipitated with the VPN antiserum,
obtained as described above, were incubated with 0.5 U of
endoglycosidase F (Boehringer Mannheim Biochemica) in a
buffer containing 100 mM sodium phosphate, pH 6.1, 50 mM
EDTA, 1% Triton X-100, 0.1% SDS and 1% β-mercaptoethanol at
37°C for 24 hours. Samples were eluted by boiling for 5
minutes in the SDS-sample buffer, and analyzed by SDS-polyacrylamide
gel electrophoresis as described above.
Hydrolysis of N-linked carbohydrates by endoglycosidase F
shifted the 53 kDa band to 51 kDa. The extracellular domain
of ALK-5 contains one potential acceptor site for N-glycosylation
and the size of the deglycosylated protein is
close to the predicted size of the core protein.

Establishment of PAE Cell Lines Expressing ALK-5

In order to investigate whether the ALK-5 cDNA encodes
a receptor for TGF-β, porcine aortic endothelial (PAE)
cells were transfected with an expression vector containing
the ALK-5 cDNA, and analyzed for the binding of 125I-TGF-β1.

PAE cells were cultured in Ham's F-12 medium
supplemented with 10% FBS and antibiotics (Miyazono et al
(1988) J. Biol. Chem. 263, 6407-6415). The ALK-5 cDNA was
cloned into the cytomegalovirus (CMV)-based expression
vector pcDNA I/NEO (Invitrogen), and transfected into PAE
cells by electroporation. After 48 hours, selection was
initiated by adding Geneticin (G418 sulphate; Gibco-BRL)
to the culture medium at a final concentration of 0.5 mg/ml
(Westermark et al (1990) Proc. Natl. Acad. Sci. USA 87,
128-132). Several clones were obtained, and after analysis
by immunoprecipitation using the VPN antiserum, one clone
denoted PAE/TβR-I was chosen and further analyzed.

Iodination of TGF-β1, Binding and Affinity Crosslinking

Recombinant human TGF-β1 was iodinated using the
chloramine T method according to Frolik et al (1984) J.
Biol. Chem. 259, 10995-11000. Cross-linking experiments
were performed as previously described (Ichijo et al
(1990) Exp. Cell Res. 187, 263-269). Briefly, cells in 6-well
plates were washed with binding buffer (phosphate-buffered
saline containing 0.9 mM CaCl2, 0.49 mM MgCl2 and
1 mg/ml bovine serum albumin (BSA)), and incubated on ice
in the same buffer with 125I-TGF-β1 in the presence or
absence of excess unlabelled TGF-β1 for 3 hours. Cells
were washed and cross-linking was done in the binding
buffer without BSA together with 0.28 mM disuccinimidyl
suberate (DSS; Pierce Chemical Co.) for 15 minutes on ice.
The cells were harvested by the addition of 1 ml of
detachment buffer (10 mM Tris-HCl, pH 7.4, 1 mM EDTA, 10%
glycerol, 0.3 mM PMSF). The cells were pelleted by
centrifugation, then resuspended in 50 µl of solubilization
buffer (125 mM NaCl, 10 mM Tris-HCl, pH 7.4, 1 mM EDTA, 1%
Triton X-100, 0.3 mM PMSF, 1% Trasylol) and incubated for
40 minutes on ice. Cells were centrifuged again and
supernatants were subjected to analysis by SDS-gel
electrophoresis using 4-15% polyacrylamide gels, followed
by autoradiography. 125I-TGF-β1 formed a 70 kDa cross-linked
complex in the transfected PAE cells (PAE/TβR-I
cells). The size of this complex was very similar to that
of the TGF-β type I receptor complex observed at lower
amounts in the untransfected cells. A concomitant increase
of the 94 kDa TGF-β type II receptor complex could also be
observed in the PAE/TβR-I cells. Components of 150-190
kDa, which may represent crosslinked complexes between the
type I and type II receptors, were also observed in the
PAE/TβR-I cells.

In order to determine whether the cross-linked 70 kDa
complex contained the protein encoded by the ALK-5 cDNA,
the affinity cross-linking was followed by
immunoprecipitation using the VPN antiserum. For this,
cells in 25 cm2 flasks were used. The supernatants
obtained after cross-linking were incubated with 7 µl of
preimmune serum or VPN antiserum in the presence or absence
of 10 µg of peptide for 1.5 hours at 4°C. Immune complexes were
then added to 50 µl of protein A-Sepharose slurry and
incubated for 45 minutes at 4°C. The protein A-Sepharose
beads were washed four times with the washing buffer, once
with distilled water, and the samples were analyzed by SDS-gel
electrophoresis using 4-15% polyacrylamide gradient
gels and autoradiography. A 70 kDa cross-linked complex
was precipitated by the VPN antiserum in PAE/TβR-I cells,
and a weaker band of the same size was also seen in the
untransfected cells, indicating that the untransfected PAE
cells contained a low amount of endogenous ALK-5. The 70
kDa complex was not observed when preimmune serum was used,
or when immune serum was blocked by 10 µg of peptide.
Moreover, a coprecipitated 94 kDa component could also be
observed in the PAE/TβR-I cells. The latter component is
likely to represent a TGF-β type II receptor complex, since
an antiserum, termed DRL, which was raised against a
synthetic peptide from the C-terminal part of the TGF-β
type II receptor, precipitated a 94 kDa TGF-β type II
receptor complex, as well as a 70 kDa type I receptor
complex from PAE/TβR-I cells.

The carbohydrate contents of ALK-5 and the TGF-β type
II receptor were characterized by deglycosylation using
endoglycosidase F as described above and analyzed by SDS-polyacrylamide
gel electrophoresis and autoradiography.
The ALK-5 cross-linked complex shifted from 70 kDa to 66
kDa, whereas that of the type II receptor shifted from 94
kDa to 82 kDa. The observed larger shift of the type II
receptor band compared with that of the ALK-5 band is
consistent with the deglycosylation data of the type I and
type II receptors on rat liver cells reported previously
(Cheifetz et al (1988) J. Biol. Chem. 263, 16984-16991),
and fits well with the fact that the porcine TGF-β type II
receptor has two N-glycosylation sites (Lin et al (1992)
Cell 68, 775-785), whereas ALK-5 has only one (see SEQ ID
No. 9).

Binding of TGF-β1 to the type I receptor is known to
be abolished by transient treatment of the cells with
dithiothreitol (DTT) (Cheifetz and Massagué (1991) J. Biol.
Chem. 266, 20767-20772; Wrana et al (1992) Cell 71, 1003-1014).
When analyzed by affinity cross-linking, binding of
125I-TGF-β1 to ALK-5, but not to the type II receptor, was
completely abolished by DTT treatment of PAE/TβR-I cells.
Affinity cross-linking followed by immunoprecipitation by
the VPN antiserum showed that neither the ALK-5 nor the
type II receptor complexes was precipitated after DTT
treatment, indicating that the VPN antiserum reacts only
with ALK-5. The data show that the VPN antiserum
recognizes a TGF-β type I receptor, and that the type I and
type II receptors form a heteromeric complex.

125I-TGF-β1 Binding & Affinity Crosslinking of Transfected
COS Cells

Transient expression plasmids of ALKs -1 to -6 and
TβR-II were generated by subcloning into the pSV7d
expression vector or into the pcDNA I expression vector
(Invitrogen). Transient transfection of COS-1 cells and
iodination of TGF-β1 were carried out as described above.
Crosslinking and immunoprecipitation were performed as
described for PAE cells above.

Transfection of cDNAs for ALKs into COS-1 cells did
not show any appreciable binding of 125I-TGF-β1, consistent
with the observation that type I receptors do not bind TGF-β
in the absence of type II receptors. When the TβR-II
cDNA was co-transfected with cDNAs for the different ALKs,
type I receptor-like complexes were seen, at different
levels, in each case. COS-1 cells transfected with TβR-II
and ALK cDNAs were analyzed by affinity crosslinking
followed by immunoprecipitation using the DRL antiserum or
specific antisera against ALKs. Each one of the ALKs bound
125I-TGF-β1 and was coimmunoprecipitated with the TβR-II
complex using the DRL antiserum. Comparison of the
efficiency of the different ALKs in forming heteromeric
complexes with TβR-II revealed that ALK-5 formed such
complexes more efficiently than the other ALKs. The size
of the crosslinked complex was larger for ALK-3 than for
other ALKs, consistent with its slightly larger size.

Expression of the ALK Protein in Different Cell Types 

Two different approaches were used to elucidate which
ALKs are physiological type I receptors for TGF-β.

Firstly, several cell lines were tested for the
expression of the ALK proteins by cross-linking followed by
immunoprecipitation using the specific antisera against
ALKs and the TGF-β type II receptor. The mink lung
epithelial cell line, Mv1Lu, is widely used to provide
target cells for TGF-β action and is well characterized
regarding TGF-β receptors (Laiho et al (1990) J. Biol.
Chem. 265, 18518-18524; Laiho et al (1991) J. Biol. Chem.
266, 9108-9112). Only the VPN antiserum efficiently
precipitated both type I and type II TGF-β receptors in the
wild-type Mv1Lu cells. The DRL antiserum also precipitated
components with the same size as those precipitated by the
VPN antiserum. A mutant cell line (R mutant) which lacks
the TGF-β type I receptor and does not respond to TGF-β
(Laiho et al, supra) was also investigated by cross-linking
followed by immunoprecipitation. Consistent with the
results obtained by Laiho et al (1990), supra, the type III
and type II TGF-β receptor complexes, but not the type I
receptor complex, were observed by affinity crosslinking.
Crosslinking followed by immunoprecipitation using the DRL
antiserum revealed only the type II receptor complex,
whereas neither the type I nor type II receptor complexes
was seen using the VPN antiserum. When the cells were
metabolically labelled and subjected to immunoprecipitation
using the VPN antiserum, the 53 kDa ALK-5 protein was
precipitated in both the wild-type and R mutant Mv1Lu
cells. These results suggest that the type I receptor
expressed in the R mutant is ALK-5, which has lost the
affinity for binding to TGF-β after mutation.

The type I and type II TGF-β receptor complexes could
be precipitated by the VPN and DRL antisera in other cell
lines, including human foreskin fibroblasts (AG1518), human
lung adenocarcinoma cells (A549), and human oral squamous
cell carcinoma cells (HSC-2). Affinity cross-linking
studies revealed multiple TGF-β type I receptor-like
complexes of 70-77 kDa in these cells. These components
were less efficiently competed by excess unlabelled TGF-β1
in HSC-2 cells. Moreover, the type II receptor complex was
low or not detectable in A549 and HSC-2 cells. Cross-linking
followed by immunoprecipitation revealed that the
VPN antiserum precipitated only the 70 kDa complex among
the 70-77 kDa components. The DRL antiserum precipitated
the 94 kDa type II receptor complex as well as the 70 kDa
type I receptor complex in these cells, but not the
putative type I receptor complexes of slightly larger
sizes. These results suggest that multiple type I TGF-β
receptors may exist and that the 70 kDa complex containing
ALK-5 forms a heteromeric complex with the TGF-β type II
receptor cloned by Lin et al (1992) Cell 68, 775-785, more
efficiently than the other species. In rat
pheochromocytoma cells (PC12), which have been reported to
have no TGF-β receptor complexes by affinity cross-linking
(Massagué et al (1990) Ann. N.Y. Acad. Sci. 593, 59-72),
neither VPN nor DRL antisera precipitated the TGF-β
receptor complexes. The antisera against ALKs -1 to -4 and
ALK-6 did not efficiently immunoprecipitate the crosslinked
receptor complexes in porcine aortic endothelial (PAE)
cells or human foreskin fibroblasts.

Next, it was investigated whether ALKs could restore
responsiveness to TGF-β in the R mutant of Mv1Lu cells,
which lack the ligand-binding ability of the TGF-β type I
receptor but have intact type II receptor. Wild-type Mv1Lu
cells and mutant cells were transfected with ALK cDNA and
were then assayed for the production of plasminogen
activator inhibitor-1 (PAI-1), which is produced as a result
of TGF-β receptor activation, as described previously by
Laiho et al (1991) Mol. Cell Biol. 11, 972-978. Briefly,
cells were incubated with or without 10 ng/ml of TGF-β1 for 2
hours in serum-free MCDB 104 without methionine.
Thereafter, cultures were labelled with [35S]methionine (40
µCi/ml) for 2 hours. The cells were removed by washing on
ice once in PBS, twice in 10 mM Tris-HCl (pH 8.0), 0.5%
sodium deoxycholate, 1 mM PMSF, twice in 2 mM Tris-HCl (pH
8.0), and once in PBS. Extracellular matrix proteins were
extracted by scraping cells into the SDS-sample buffer
containing DTT, and analyzed by SDS-gel electrophoresis
followed by fluorography using Amplify. PAI-1 can be
identified as a characteristic 45 kDa band (Laiho et al
(1991) Mol. Cell Biol. 11, 972-978). Wild-type Mv1Lu cells
responded to TGF-β and produced PAI-1, whereas the R mutant
clone did not, even after stimulation by TGF-β1. Transient
transfection of the ALK-5 cDNA into the R mutant clone led
to the production of PAI-1 in response to the stimulation
by TGF-β1, indicating that the ALK-5 cDNA encodes a
functional TGF-β type I receptor. In contrast, the R
mutant cells that were transfected with other ALKs did not
produce PAI-1 upon the addition of TGF-β1.

Using approaches similar to those described above for
the identification of TGF-β-binding ALKs, the ability of
ALKs to bind activin in the presence of ActR-II was
examined. COS-1 cells were co-transfected as described
above. Recombinant human activin A was iodinated using the
chloramine T method (Mathews and Vale (1991) Cell 65, 973-982).
Transfected COS-1 cells were analysed for binding
and crosslinking of 125I-activin A in the presence or
absence of excess unlabelled activin A. The crosslinked
complexes were subjected to immunoprecipitation using DRL
antisera or specific ALK antisera.

All ALKs appear to bind activin A in the presence of
ActR-II. This is more clearly demonstrated by affinity
cross-linking followed by immunoprecipitation. ALK-2 and
ALK-4 bound 125I-activin A and were coimmunoprecipitated
with ActR-II. Other ALKs also bound 125I-activin A but with
a lower efficiency compared to ALK-2 and ALK-4.

In order to investigate whether ALKs are physiological
activin type I receptors, activin-responsive cells were
examined for the expression of endogenous activin type I
receptors. Mv1Lu cells, as well as the R mutant, express
both type I and type II receptors for activin, and the R
mutant cells produce PAI-1 upon the addition of activin A.
Mv1Lu cells were labelled with 125I-activin A, cross-linked
and immunoprecipitated by the antisera against ActR-II or
ALKs as described above.

The type I and type II receptor complexes in Mv1Lu
cells were immunoprecipitated only by the antisera against
ALK-2, ALK-4 and ActR-II. Similar results were obtained
using the R mutant cells. PAE cells do not bind activin
because they lack type II receptors for activin, and
so cells were transfected with a chimeric receptor, to
enable them to bind activin, as described herein. A
plasmid (chim A) containing the extracellular domain and C-
terminal tail of ActR-II (amino acids -19 to 116 and 465
to 494, respectively (Mathews and Vale (1991) Cell 65,
973-982)) and the kinase domain of TβR-II (amino acids 160-
543) (Lin et al (1992) Cell 68, 775-785) was constructed
and transfected into pcDNA/neo (Invitrogen). PAE cells
were stably transfected with the chim A plasmid by
electroporation, and cells expressing the chim A protein
were established as described previously. PAE/Chim A cells
were then subjected to 125I-activin A labelling, crosslinking
and immunoprecipitation as described above.

Similar to Mv1Lu cells, activin type I receptor
complexes in PAE/Chim A cells were immunoprecipitated by
the ALK-2 and ALK-4 antisera. These results show that both
ALK-2 and ALK-4 serve as high affinity type I receptors for
activin A in these cells.

ALK-1, ALK-3 and ALK-6 bind TGF-β1 and activin A in
the presence of their respective type II receptors, but the
functional consequences of the binding of the ligands
remain to be elucidated.

The invention has been described by way of example
only, without restriction of its scope. The invention is
defined by the subject matter herein, including the claims
that follow the immediately following full Sequence
Listings.
SEQUENCE LISTING

(1) GENERAL INFORMATION:

(i) APPLICANT:
(A) NAME: Ludwig Institute for Cancer Research
(B) STREET: St. Mary's Hospital Medical School, Norfolk Place
(C) CITY: Paddington, London
(E) COUNTRY: United Kingdom
(F) POSTAL CODE (ZIP): W2 1PG

(ii) TITLE OF INVENTION: PROTEINS HAVING SERINE/THREONINE KINASE
DOMAINS, CORRESPONDING NUCLEIC ACID MOLECULES, AND THEIR
USE

(iii) NUMBER OF SEQUENCES: 29

(iv) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.25 (EPO)



(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1984 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iii) ANTI-SENSE: NO

(v) FRAGMENT TYPE: internal

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Homo sapiens

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 283..1791
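As a quick consistency check on the feature tables in this listing, the number of codons in a CDS can be computed from its coordinates and compared with the stated protein length. The sketch below is illustrative only and is not part of the filed listing; it assumes the LOCATION values are 1-based and inclusive and that the stop codon lies outside the annotated interval. The function name `cds_codon_count` is hypothetical.

```python
def cds_codon_count(start: int, end: int) -> int:
    """Return the number of complete codons in a 1-based, inclusive CDS interval."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError(f"CDS {start}..{end} is {length} nt, not a multiple of 3")
    return length // 3


# (CDS start, CDS end, stated protein length) as given in this listing.
FEATURES = {
    "SEQ ID NO: 1 / 2": (283, 1791, 503),
    "SEQ ID NO: 3 / 4": (104, 1630, 509),
}

for name, (start, end, stated_length) in FEATURES.items():
    codons = cds_codon_count(start, end)
    status = "consistent" if codons == stated_length else "MISMATCH"
    print(f"{name}: {codons} codons vs {stated_length} residues ({status})")
```

A check of this kind is useful when a listing has been re-keyed or scanned, because a single misread digit in a LOCATION field usually breaks the multiple-of-three test or the length comparison.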



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: 

AGCAAACCGT TTATTAGCAG GGACTCCTGG AGCTCCCCCA GGCAGGAAGA CGCTGGAATA 60 

AGAAACATTT TTGCTCCAGC CCCCATCCCA GTCCCGGGAG OCTGCCGCCC CAGCTGCGCC 120 

GAGCGAGCCC CTCCCCGGCT CCAGCCCCGT CCGGGGCCGC GCCCGACCCC AGCCCCCCGT 180 

CCAGCGCTGG CGGTGCAACT CCGCCCGCCC GGTGGAGGGG AGGTGGCCCC GGTCCGCCCA 240 
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AGGCTAGCGC CCCCCCACCC GCAGAGCGCG CCGAGAGGGA CC ATO ACC TTG CGC 294 

Met Thr Leu Cly 
1 

TCC CCC AGO AAA CGC CTT CTG ATC CTG CTG ATG CCC TTG GTG ACC CAG 342 
Ser Pro Arg Lys Gly Leu Leu Het Leu Leu Kit Ala Leu Val Thr Gin 
5 10 15 20 

GGA GAC CCT GTG AAO CCG TCT CGG GGC CCG CTG CTG ACC TCC ACC TGT 390 
Gly Asp Pro Val Lys Pro Ser Arg Gly Pro Leu Val Thr Cys Thr Cys 
25 30 35 

CAG AGC CCA CAT TCC AAG CCC CCT ACC TCC CCC CCG CCC TCC TCC ACA 438 
Clu Ser Pro His Cys Lys Gly Pro Thr Cys Arg Cly Ala Trp Cys Thr 
40 45 50 

GTA CTG CTG CTG CCG CAG GAG CCG AGC GAC CCC CAG GAA CAT CGC GGC 486 
Val Val Leu Val Arg Glu Glu Gly Arg Els Pro Gin Clu Bis Arg Gly 
55 60 65 

TCC CGC AAC TTG CAC ACG CAG CTC TGC ACC CCC CCC CCC ACC CAG TTC 534 
Cys Gly Ain Leu His Arg Clu Leu Cys Arg Cly Arg Pro Thr Clu Phe 
70 75 80 

CTC AAC CAC TAC TCC TGC CAC ACC CAC CTC TCC AAC CAC AAC CTC TCC 582 
Val Asn His Tyr Cys Cys Asp Ser His Leu Cys Asn His Asn Val Ssr 
85 90 95 100 

CTC CTC CTC CAC CCC ACC CAA CCT CCT TCC CAC CAC CCC CCA ACA CAT 630 
Leu Val Leu Clu Ala Thr Gin Pro Pro Ser Clu Cln Pro Cly Thr Asp 
105 110 115 

CCC CAG CTG CCC CTC ATC CTG CCC CCC CTG CTG CCC TTG CTC CCC CTG 678 
Cly Gin Leu Ala Leu lie Leu Cly Pro Val Leu Ala Leu Leu Ala Leu 
120 125 130 

CTG GCC CTC CCT CTC CTC CCC CTC TCG CAT CTC CCA CCC AGC CAC CAC 726 
Val Ala Leu Cly Val Leu Cly Leu Trp His Val Arg Arg Arg Gin Clu 
135 140 145 

AAG CAG CGT CCC CTC CAC ACC CAG CTG CCA CAC TCC AGT CTC ATC CTC 774 
Lys Gin Arg Gly Leu His Ser Glu Leu Cly Clu Ser Ser Leu lie Leu 
150 155 160 

AAA CCA TCT GAG CAG CCC CAC ACC ATC TTG CCC CAC CTC CTG CAC ACT 822 
Lys Ala Ser Glu Gin Gly Asp Thr Ket Leu Gly Asp Leu Leu Asp Ser 
165 170 175 180 

GAC TGC ACC ACA GCC AGT CGC TCA CGG CTC CCC TTC CTG GTG CAG AGC 870 
Asp Cys Thr Thr Gly Ser Gly Ser Cly Leu Pro Phe Leu Val Cln Arg 
185 190 195 

ACA GTG CCA CCC CAG CTT CCC TTC CTG CAC TGT GTG GGA AAA GCC CCC 918 
Thr Val Ala Arg Gin Val Ala Leu Val Clu Cys Val Gly Lys Gly Arg 
200 205 210 

TAT CCC GAA CTC TCC CGG CCC TTG TGC CAC CGT GAG AGT GTG CCC CTC 966 
Tvr Gly Clu Val Trp Arg Cly Leu Trp His Cly Glu Ser Val Ala Val 
215 220 225 
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AAG ATC TTC TCC TCG AGO CAT CAA CAC TCC TOO TTC CCC CAG ACT CAO 1014 
Lye llm Phe Ser Ser Arg Asp Clu Gin Ser Trp I>he Arg Glu Thr Glu 
230 235 240 

ATC TAT AAC ACA CTA TTC CTC AGA CAC GAC AAC ATC CTA GGC TTC ATC 1062 
II* Tyr Asn Thr Val Leu Leu Arg His Asp Asn llm Leu Gly Phe llm 
245 250 255 260 

CCC TCA GAC ATC ACC TCC CGC AAC TCG AGC ACG CAG CTG TGC CTC ATC 1110 
Aim Ser Aip Ket Thr Ser Arg Asn Ser Smr Thr Gin Leu Trp Lm\x llm 
265 270 275 

ACG CAC TAC CAC GAG CAC GGC TCC CTC TAC GAC TTT CTC CAC AGA CAG 1158 
Thr Hi* Tyr Hi* Glu His Gly Ser Leu Tyr A*p Ph* Leu Gin Arg Gin 
280 285 290 

ACG CTG GAG CCC CAT CTG GCT CTG AGG CTA GCT GTC TCC CCC GCA TGC 1206 
Thr Leu Glu Pro Hit Leu Ala Leu Arg L*u Ala Val Smr Ala Ala Cy* 
295 300 305 

GGC CTC GCC CAC CTC CAC CTG GAG ATC TTC GCT ACA CAC CGC AAA CCA 1254 
Gly Leu Ala Hi* Leu Hi* Val Glu lit Ph* Gly Thr Gin Gly Ly« Pro 
310 315 320 

GCC ATT GCC CAC CCC CAC TTC AAC AGC CCC AAT CTC CTC CTC AAC AGC 1302 
Ala II* Ala Hi* Arg A*p Ph* Ly* Smr Arg A*n Val L*u Val Ly* fier 
325 330 335 340 

AAC CTC CAC TGT TCC ATC CCC CAC CTC CGC CTC GCT CTC ATC CAC TCA 1350 
Asn Leu Gin Cy* Cy* II* Ala A«p Leu Gly Leu Ala Val Ket His Ser 
345 350 355 

CAC CCC ACC CAT TAC CTC CAC ATC CGC AAC AAC CCC ACA CTC CGC ACC 1398 
Gin Gly Ser Asp Tyr Leu Asp lie Gly Asn Asn Pro Arg Val Gly Thr 
360 365 370 

AAC CGC TAC ATC CCA CCC GAG CTC CTG GAC CAG CAC ATC CGC ACC CAC 1446 
Lys Arg Tyr Met Ala Pro Clu Val Leu Asp Clu Gin He Arg Thr Asp 
375 380 385 

TCC TTT CAG TCC TAC AAC TCC ACT GAC ATC TCG CCC TTT CGC CTC GTC 1494 
Cys Phe Glu Ser Tyr Lys Trp Thr Asp He Trp Ala Phe Gly Leu Val 
390 395 400 

CTC TCC GAC ATT CCC CGC CCC ACC ATC CTG AAT GCC ATC CTG GAC CAC 1542 
Leu Trp Glu He Ala Arg Arg Thr He Val Asn Gly He Val Glu Asp 
405 410 415 420 

TAT AGA CCA CCC TTC TAT GAT CTC CTG CCC AAT CAC CCC AGC TTT CAC 1590 
Tyr Arg Pro Pro Phe Tyr Asp Val Val Pro Asn Asp Pro Ser Ph* Glu 
425 430 435 

GAC ATC AAG AAG CTC CTG TGT GTC CAT CAG CAG ACC CCC ACC ATC CCT 1638 
Asp Met Lys Lys Val Val Cys Val Asp Gin Gin Thr Pro Thr llm Pro 
440 445 450 

AAC CGC CTC GCT CCA CAC CCC GTC CTC TCA GCC CTA CCT CAG ATG ATC 1686 
Asn Arg Leu Ala Ala Asp Pro Val Leu Ser Gly Leu Ala Gin Ket Ket 
45S 460 465 
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CGG GAG TGC TOO TAG CCA AAC CCC TCT GCC OCA CTC ACC CCG CTG COO 1734 
Arg Glu Cys Trp Tyr Pro Asn Pro Ser Ala Arg Lau Thr Ala Leu Arg 
470 475 480 

ATC AAG AAG ACA CTA CAA AAA ATT AGC AAC AGT CCA GAG AAG CCT AAA 1782 
He Lys Lys Thr Leu Gin Lya Ila Bar Am 5ar Pro Glu Lya Pro Lye 
485 490 495 500 

GTG ATT CAA TAGCCCAGCA GCACCTGATT CCTTTCTGCC TGCAGGGCGC 1831 
Val Ila Gin 

TGGGGOOGTG GOOOOCAGTG CATGGTGCCC TATCTGGGTA CAGCTAGTCT GACTCTGGTG 1891 

TGTGCTGOGG ATGGGCAGCT GCGCCTGCCT OCTCGGCCCC CAGCCCACCC AGCCAAAAAT 1951 

ACAGCTGGGC TGAAACCTGA AAAAAAAAAA AAA 1984 
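The pairing of codons and residues shown above can be reproduced with the standard genetic code. The following is a minimal sketch, not part of the filed listing: it builds the standard codon table and translates the last three codons of the open reading frame of SEQ ID NO: 1 together with the stop codon that follows them (GTG ATT CAA TAG). The names `CODON_TABLE` and `translate` are illustrative.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1, stopping at the first stop codon."""
    dna = dna.replace(" ", "").upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[i:i + 3]]
        if residue == "*":  # stop codon: end of the open reading frame
            break
        protein.append(residue)
    return "".join(protein)


# One-letter codes for Val Ile Gln, the final residues of SEQ ID NO: 2.
print(translate("GTG ATT CAA TAG"))
```

Applied to a clean copy of the full coding region, the same function would be expected to return the 503-residue sequence of SEQ ID NO: 2; that has not been run here, since the nucleotide text in this copy contains scanning errors.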



(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 503 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Met Thr Leu Gly Ser Pro Arg Lys Gly Leu Leu Met Leu Leu Met Ala
1               5                   10                  15

Leu Val Thr Gln Gly Asp Pro Val Lys Pro Ser Arg Gly Pro Leu Val
            20                  25                  30

Thr Cys Thr Cys Glu Ser Pro His Cys Lys Gly Pro Thr Cys Arg Gly
        35                  40                  45

Ala Trp Cys Thr Val Val Leu Val Arg Glu Glu Gly Arg His Pro Gln
    50                  55                  60

Glu His Arg Gly Cys Gly Asn Leu His Arg Glu Leu Cys Arg Gly Arg
65                  70                  75                  80

Pro Thr Glu Phe Val Asn His Tyr Cys Cys Asp Ser His Leu Cys Asn
                85                  90                  95

His Asn Val Ser Leu Val Leu Glu Ala Thr Gln Pro Pro Ser Glu Gln
            100                 105                 110

Pro Gly Thr Asp Gly Gln Leu Ala Leu Ile Leu Gly Pro Val Leu Ala
        115                 120                 125

Leu Leu Ala Leu Val Ala Leu Gly Val Leu Gly Leu Trp His Val Arg
    130                 135                 140

Arg Arg Gln Glu Lys Gln Arg Gly Leu His Ser Glu Leu Gly Glu Ser
145                 150                 155                 160

Ser Leu Ile Leu Lys Ala Ser Glu Gln Gly Asp Thr Met Leu Gly Asp
                165                 170                 175

Leu Leu Asp Ser Asp Cys Thr Thr Gly Ser Gly Ser Gly Leu Pro Phe
            180                 185                 190

Leu Val Gln Arg Thr Val Ala Arg Gln Val Ala Leu Val Glu Cys Val
        195                 200                 205

Gly Lys Gly Arg Tyr Gly Glu Val Trp Arg Gly Leu Trp His Gly Glu
    210                 215                 220

Ser Val Ala Val Lys Ile Phe Ser Ser Arg Asp Glu Gln Ser Trp Phe
225                 230                 235                 240

Arg Glu Thr Glu Ile Tyr Asn Thr Val Leu Leu Arg His Asp Asn Ile
                245                 250                 255

Leu Gly Phe Ile Ala Ser Asp Met Thr Ser Arg Asn Ser Ser Thr Gln
            260                 265                 270

Leu Trp Leu Ile Thr His Tyr His Glu His Gly Ser Leu Tyr Asp Phe
        275                 280                 285

Leu Gln Arg Gln Thr Leu Glu Pro His Leu Ala Leu Arg Leu Ala Val
    290                 295                 300

Ser Ala Ala Cys Gly Leu Ala His Leu His Val Glu Ile Phe Gly Thr
305                 310                 315                 320

Gln Gly Lys Pro Ala Ile Ala His Arg Asp Phe Lys Ser Arg Asn Val
                325                 330                 335

Leu Val Lys Ser Asn Leu Gln Cys Cys Ile Ala Asp Leu Gly Leu Ala
            340                 345                 350

Val Met His Ser Gln Gly Ser Asp Tyr Leu Asp Ile Gly Asn Asn Pro
        355                 360                 365

Arg Val Gly Thr Lys Arg Tyr Met Ala Pro Glu Val Leu Asp Glu Gln
    370                 375                 380

Ile Arg Thr Asp Cys Phe Glu Ser Tyr Lys Trp Thr Asp Ile Trp Ala
385                 390                 395                 400

Phe Gly Leu Val Leu Trp Glu Ile Ala Arg Arg Thr Ile Val Asn Gly
                405                 410                 415

Ile Val Glu Asp Tyr Arg Pro Pro Phe Tyr Asp Val Val Pro Asn Asp
            420                 425                 430

Pro Ser Phe Glu Asp Met Lys Lys Val Val Cys Val Asp Gln Gln Thr
        435                 440                 445

Pro Thr Ile Pro Asn Arg Leu Ala Ala Asp Pro Val Leu Ser Gly Leu
    450                 455                 460

Ala Gln Met Met Arg Glu Cys Trp Tyr Pro Asn Pro Ser Ala Arg Leu
465                 470                 475                 480

Thr Ala Leu Arg Ile Lys Lys Thr Leu Gln Lys Ile Ser Asn Ser Pro
                485                 490                 495

Glu Lys Pro Lys Val Ile Gln
            500

(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2724 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iii) ANTI-SENSE: NO

(v) FRAGMENT TYPE: internal

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Homo sapiens

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 104..1630

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

CTCCGAGTAC CCCAGTGACC AGAGTGAGAG AAGCTCTGAA CGAGGCCACG CGGCTTGAAG 60 

GACTGTGGGC AGATGTGACC AAGAGCCTGC ATTAAGTTGT ACA ATG CTA CAT GGA 115 

Met Val Asp Gly 

GTG ATG ATT CTT CCT CTG CTT ATC ATG ATT GCT CTC CCC TCC CCT ACT 163 
Val Met He Leu Pro Val Leu He Met He Ala Leu Pro Ser Pro Ser 
5 10 15 20 

ATG GAA GAT GAG AAG CCC AAG GTC AAC CCC AAA CTC TAC ATG TGT GTG 211 
Met Glu Asp Glu Lys Pro Lys Val Asa Pro Lys Leu Tyr Met Cys Val 
25 30 35 

TGT GAA GGT CTC TCC TGC GCT AAT GAG GAC CAC TGT GAA GGC GAG CAG 259 
Cys Glu Gly Leu Ser Cys Gly Asn Glu Asp His Cys Glu Gly Gin Gin 
40 45 50 

TGC TTT TCC TCA CTG AGC ATC AAC GAT GGC TTC CAC CTC TAC CAG AAA 307 
Cys Phe Ser Ser Leu Ser He Asn Asp Gly Phe His Val Tyr Gin Lys 
55 60 65 

CCC TGC TTC CAG GTT TAT CAG CAG GGA AAG ATG ACC TGT AAG ACC CCG 355 
Gly Cys Phe Gin Val Tyr Glu Gin Gly Lys Met Thr Cys Lys Thr Pro 
70 75 60 
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403 



451 



CCC TCC CCT CCC CAA OCT CTC CAG TCC TCC CAA CCC ttxe ma *~ 
Pro S.r Pro Cly cin Alj w Clu Cy. Cy. S£ JJ g» 

ACC AAC ATC AOG CCC CAC CTC CCC ACT XXX <v» it* 

a., A.n a. ikt „. elI to « g »» gj «J J« >« cot « 

110 

125 130 

sssaassEgsaaasaaa s " 
sgssssaaaaaaaaaaa - 

iM 160 

gssaasaasasaaasa - 

175 180 

a a a a a a a a a a a a a a a a - 

1 5 190 195 

CCT TTT CTG CTA CAA AGA ACA CTC CCT CCC CAC ATT xrx r-~ 
Pro Phe L.u Vjl Cla Ax 5 Thr V.l jS A?S SS22 2! W 

205 210 

a a a a a a a a a a a a a a a a «• 

sssKisaaagaaaasjaaaa « 
a a a a a a a a a a a a a a a a «• 

255 260 

AAT ATC TTA CCT TTC ATT CCT TCX exe »«. ... 

*» „. alY ». „. a ~ «« j«- }« J» j» gj « j„ 

. 270 275 

a a a a a a a a a a a "a a a a a •» 
a a a a a a a a a a a a g a a a 
a a a a a a a a a a a a a a a a »» 
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GGG ACC CAA GGG AAA CCA GCC ATT GCC CAT CCA GAT TTA AAG ACC AAA 1123 
Gly Thr Gin Cly Lys Pro Ala lie Ale Hit Arg Asp Leu Ly» Ser Lys 
325 330 335 340 

AAT ATT CTG CTT AAO AAG AAT CGA CAO TOT TGC ATA CCA GAT TTG GGC 1171 
Asn 11* Leu Vel Lys Lys Ash Gly Gin Cyi Cys lie Ala Asp Leu Gly 
345 350 355 

CTG GCA GTC ATG CAT TCC CAG AGC ACC AAT CAG CTT CAT GTG GGG AAC 1219 
Leu Ala Val Met Bis Ser Gin Ssr Thr Asn Gin Leu Asp Val Gly Asn 
350 365 370 

AAT CCC CGT CTG GGC ACC AAG CGC TAC ATG GCC CCC GAA GTT CTA GAT 1267 
Asn Fro Arg Vai Gly Thr Lys Arg Tyr Het Ala Pro Clu Val Lsu Asp 
375 380 385 

GAA ACC ATC CAG CTG CAT TCT TTC CAT TCT TAT AAA AGG GTC CAT ATT 1315 
Clu Thr lis Cln Val Atp Cys Phe Aip Car Tyr Lys Arg Val Asp lis 
390 395 400 

TGC GCC TTT GGA CTT CTT TTG TGG CAA GTG GCC AGG CGC ATG CTG AGC 1363 
Trp Ala Phs Gly Lsu Val Lsu Trp Glu Val Ala Arg Arg Met Val Ser 
405 410 415 420 

AAT GGT ATA GTG CAG GAT TAC AAG CCA CCG TTC TAC GAT CTG GTT CCC 1411 
Asn Gly lis Val Glu Asp Tyr Lys Pro Pro Phe Tyr Asp Val Val Pro 
425 430 435 

AAT CAC CCA AGT TTT CAA CAT ATG AGG AAG GTA GTC TGT CTG CAT CAA 1459 
Asn Asp Pro Ssr Phs Clu Asp Hst Arg Lys Val Vai Cys Val Asp Gin 
440 445 450 

CAA AGC CCA AAC ATA CCC AAC AGA TGG TTC TCA GAC CCG ACA TTA ACC 1507 
Gin Arg Pro Aon lis Pro Asn Arg Trp Phs Ssr Asp Pro Thr Leu Thr 
455 460 465 

TCT CTG CCC AAG CTA ATG AAA CAA TGC TGG TAT CAA AAT CCA TCC CCA 1555 
Ser Leu Ala Lyi Leu Met Lys Glu Cys Trp Tyr Gin Asn Pro Ser Ala 
470 475 480 

AGA CTC ACA GCA CTG CGT ATC AAA AAG ACT TTC ACC AAA ATT GAT AAT 1603 
Arg Leu Thr Ala Leu Arg lie Lys Lys Thr Lsu Thr Lys He Asp Asn 
485 490 495 500 

TCC CTC CAC AAA TTC AAA ACT CAC TGT TGACATTTTC ATAGTCTCAA 1650 
Ser Leu Asp Lyi Lsu Lys Thr Asp Cys 
505 



GAAGGAAGAT TTGACGTTGT TGTCATTGTC CAGCTGGGAC CTAATGCTGG CCTGACTGGT 1710

TCTCACAATG GAATCCATCT CTCTCCCTCC CCAAATGGCT CCTTTCACAA GGCACACGTC 1770

GTACCCAGCC ATGTGTTGGG CAGACATCAA AACCACCCTA ACCTCCCTCC ATCACTGTGA 1830

ACTGCGCATT TCACGAACTG TTCACACTGC AGAGACTAAT GTTGGACAGA CACTGTTGCA 1890

AAGGTAGGGA CTGGAGCAAC ACACAGAAAT CCTAAAAGAG ATCTCGGCAT TAAGTCAGTG 1950

CCTTTGCATA CCTTTCACAA CTCTCCTACA CACTCCCCAC GGCAAACTCA AGGAGGTCGT 2010
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GAATTTTTAA TCAGCAATAT TOCCTCTGCT TCTCTTCTTT ATTGCACTAG GAATTCTTTG 2070 

CATTCCTTAC TTOCACTGTT ACTCTTAATT TTAAAGACCC AACTTGCCAA AATCTTCCCT 2130 

GCGTACTCCA CTGCTCTCTC TTTCCATAAT AGGAATTCAA TTTGGCAAAA CAAAATGTAA 2190 

TGTCAGACTT TGCTGCATTT TACACATGTG CTGATGTTTA CAATGATCCC CAACATTAGG 2250 

AATTCTTTAT ACACAACTTT GCAAATTATT TATXACTTGT GCACTTACTA GTTTTXACAA 2310 

AACTGCTTTG TGCATATCTT AAAGCTTATT TTTATCTGGT CTTATGATTT TATTACAGAA 2370 

ATGTTTTTAA CACTATACTC TAAAATGGAC ATTTTCTTTT ATTATCAGTT AAAATCACAT 2430 

TTTAAGTCCT TCACATTTCT ATGTGTGTAG ACTGTAACTX TTTTTCAGTT CATATGCAGA 2490 

ACGTATTTAG CCATTACCCA CGTGACACCA CCGAATATAT TATCCATTTA GAAGCAAAGA 2550 

TTTCAGTAGA ATTTTAGTCC TGAACGCTAC GGGGAAAATG CATTTTCTTC AGAATXATCC 2610 

ATTACGTGCA TTTAAfcCTCT GCCAGAAAAA AATAACTATT TTGTTTTAAT CTACTTTTTG 2670 

TATTTAGTAG TTATTTGTAT AAATTAAATA AACTCTTXTC AAGTCAAAAA AAAA 2724 

(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 509 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

Met Val Asp Gly Val Met Ile Leu Pro Val Leu Ile Met Ile Ala Leu
1               5                   10                  15

Pro Ser Pro Ser Met Glu Asp Glu Lys Pro Lys Val Asn Pro Lys Leu
            20                  25                  30

Tyr Met Cys Val Cys Glu Gly Leu Ser Cys Gly Asn Glu Asp His Cys
        35                  40                  45

Glu Gly Gln Gln Cys Phe Ser Ser Leu Ser Ile Asn Asp Gly Phe His
    50                  55                  60

Val Tyr Gln Lys Gly Cys Phe Gln Val Tyr Glu Gln Gly Lys Met Thr
65                  70                  75                  80

Cys Lys Thr Pro Pro Ser Pro Gly Gln Ala Val Glu Cys Cys Gln Gly
                85                  90                  95

Asp Trp Cys Asn Arg Asn Ile Thr Ala Gln Leu Pro Thr Lys Gly Lys
            100                 105                 110

Ser Phe Pro Gly Thr Gln Asn Phe His Leu Glu Val Gly Leu Ile Ile
        115                 120                 125

Leu Ser Val Val Phe Ala Val Cys Leu Leu Ala Cys Leu Leu Gly Val
    130                 135                 140

Ala Leu Arg Lys Phe Lys Arg Arg Asn Gln Glu Arg Leu Asn Pro Arg
145                 150                 155                 160

Asp Val Glu Tyr Gly Thr Ile Glu Gly Leu Ile Thr Thr Asn Val Gly
                165                 170                 175

Asp Ser Thr Leu Ala Asp Leu Leu Asp His Ser Cys Thr Ser Gly Ser
            180                 185                 190

Gly Ser Gly Leu Pro Phe Leu Val Gln Arg Thr Val Ala Arg Gln Ile
        195                 200                 205

Thr Leu Leu Glu Cys Val Gly Lys Gly Arg Tyr Gly Glu Val Trp Arg
    210                 215                 220

Gly Ser Trp Gln Gly Glu Asn Val Ala Val Lys Ile Phe Ser Ser Arg
225                 230                 235                 240

Asp Glu Lys Ser Trp Phe Arg Glu Thr Glu Leu Tyr Asn Thr Val Met
                245                 250                 255

Leu Arg His Glu Asn Ile Leu Gly Phe Ile Ala Ser Asp Met Thr Ser
            260                 265                 270

Arg His Ser Ser Thr Gln Leu Trp Leu Ile Thr His Tyr His Glu Met
        275                 280                 285

Gly Ser Leu Tyr Asp Tyr Leu Gln Leu Thr Thr Leu Asp Thr Val Ser
    290                 295                 300

Cys Leu Arg Ile Val Leu Ser Ile Ala Ser Gly Leu Ala His Leu His
305                 310                 315                 320

Ile Glu Ile Phe Gly Thr Gln Gly Lys Pro Ala Ile Ala His Arg Asp
                325                 330                 335

Leu Lys Ser Lys Asn Ile Leu Val Lys Lys Asn Gly Gln Cys Cys Ile
            340                 345                 350

Ala Asp Leu Gly Leu Ala Val Met His Ser Gln Ser Thr Asn Gln Leu
        355                 360                 365

Asp Val Gly Asn Asn Pro Arg Val Gly Thr Lys Arg Tyr Met Ala Pro
    370                 375                 380

Glu Val Leu Asp Glu Thr Ile Gln Val Asp Cys Phe Asp Ser Tyr Lys
385                 390                 395                 400

Arg Val Asp Ile Trp Ala Phe Gly Leu Val Leu Trp Glu Val Ala Arg
                405                 410                 415

Arg Met Val Ser Asn Gly Ile Val Glu Asp Tyr Lys Pro Pro Phe Tyr
            420                 425                 430

Asp Val Val Pro Asn Asp Pro Ser Phe Glu Asp Met Arg Lys Val Val
        435                 440                 445

Cys Val Asp Gln Gln Arg Pro Asn Ile Pro Asn Arg Trp Phe Ser Asp
    450                 455                 460

Pro Thr Leu Thr Ser Leu Ala Lys Leu Met Lys Glu Cys Trp Tyr Gln
465                 470                 475                 480

Asn Pro Ser Ala Arg Leu Thr Ala Leu Arg Ile Lys Lys Thr Leu Thr
                485                 490                 495

Lys Ile Asp Asn Ser Leu Asp Lys Leu Lys Thr Asp Cys
            500                 505

(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2932 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iii) ANTI-SENSE: NO

(v) FRAGMENT TYPE: internal

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Homo sapiens

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 310..1905

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

CCTCCGCCCC CAGCGCTGCA CGATGCGTTC CCTCCCGTCC GCACTTATCA AAATATGCAT 60 

CAGTTTAATA CTCTCTTGCA ATTCATGAGA TOSAAGCATA OCTCAAACCT CTTTOGAGAA 120 

AATCAGAAGT ACAGTTTTAT CTAOCCACAT CTTCCAGCAG TCCTAAGAAA GCACTGCGAG 180 

TTGAAGTCAT TGTCAAGTGC TTCCGATCTT TTACAAGAAA ATCTCACTGA ATCATAGTCA 240 

TTTAAATTCG TGAAGTAGCA AGACCAATTA TTAAACGTCA CACTACACAC GAAACATTAC 300 

AATTCAACA ATC ACT CAO CTA TAC ATT TAC ATC AGA TTA TTO CGA CCC am 
Mat Thr Cln Lau Tyr II. Tyr II. Arg Lau Leu Cly AU 

* 5 in 



10 

TAT TTO TTC ATC ATT TCT CGT GTT CAA CGA CAG AAT CTC CAT ACT ATC. toe 
Tyr Leu Pha II. II. S.r Arg Val Gin Cly Cln Aan 2J stl S5 96 
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CTT CAT GGC ACT CCG XTC AAA TCA CAC TCC CAC CAG AAA AAG TCA GAA 444 
Leu His Cly Thr Cly Met Lye Ser Asp Ser Asp Gin Lys Lys ser Glu 
30 35 40 45 

AAT GCA CTA ACC TTA CCA CCA CAG CAT ACC TTG CCT TTT TTA AAG TCC 492 
Asn Cly Val Thr Leu Ala Pro Clu Asp Thr Leu Pro Phe Leu Lys Cys 
50 55 60 

TAT TGC TCA CGC CAC TOT CCA CAT CAT CCT ATT AAT AAC ACA TCC ATA 540 
Tyr Cys Ser Gly Bis Cys Pro Asp Asp Ala lis Asn Asa Thr Cys lis 
65 70 75 

ACT AAT CCA CAT TCC TTT CCC ATC ATA CAA CAA GAT CAC CAC CCA CAA 566 
Thr Asn Gly Bis Cys Pha Alt 21s lis Clu Clu Asp Asp Gin Gly Clu 
60 85 90 

ACC ACA TTA CCT TCA CCG TCT ATC AAA TAT CAA CCA TCT CAT TTT CAC 636 
Thr Thr Leu Als Ssr Cly Cys Kst Lys Tyr Clu Cly Ssr Asp Phs Cln 
95 100 105 

TCC AAA CAT TCT CCA AAA CCC CAC CTA CCC CCC ACA ATA CAA TCT TCT 684 
Cys Lys Asp Ssr Pro Lys Als Cln Lau Arg Arg Thr lis Clu Cys Cys 
110 115 120 125 

CGC ACC AAT TTA TCT AAC CAC TAT TTC CAA CCC ACA CTC CCC CCT CTT 732 
Arg Thr Asn Leu Cys Asn Cln Tyr Lau Cln Pro Thr Lau Pro Pro Val 
130 135 140 

GTC ATA CCT CCG TTT TTT CAT CGC AGC ATT CCA TCC CTG CTT TTC CTC 780 
Val lie Gly Pro Pha Pha Asp Gly Sar I la Arg Trp Lau Val Lau Lau 
145 150 155 

ATT TCT ATG CCT CTC TGC ATA ATT CCT ATG ATC ATC TTC TCC ACC TCC 828 
lis Ser Met Ala Val Cys lie lie Ala Met lie lie Phe Ser Sar Cys 
160 165 170 

TTT TGT TAC AAA CAT TAT TCC AAG AGC ATC TCA AGC ACA CCT CCT TAC 876 
Phe Cys Tyr Lys His Tyr Cys Lys Ser lie Ssr Sar Arg Arg Arg Tyr 
175 180 185 

AAT CGT CAT TTG CAA CAG GAT CAA CCA TTT ATT CCA CTT CCA CAA TCA 924 
Asn Arg Asp Leu Clu Gin Asp Glu Ala Pha lie Pro Val Cly Glu Sar 
190 195 200 205 

CTA AAA CAC CTT ATT CAC CAC TCA CAA ACT TCT CCT ACT GGG TCT CCA 972 
Leu Lys Asp Leu 21a Asp Cln Sar Gin Sar Sar Gly Sar Gly Ser Gly 
210 215 220 

CTA CCT TTA TTC CTT CAG CCA ACT ATT CCC AAA CAC ATT CAG ATG CTC 1020 
Leu Pro Lau Leu Val Cln Arg Thr lie Ala Lys Cln lie Gin Bet Val 
225 230 235 

CGC CAA CTT GCT AAA CGC CCA TAT CCA CAA CTA TGC ATG GGC AAA TGC 1068 
Arg Cln Val Cly Lys Cly Arg Tyr Gly Clu Val Trp Bet Cly Lys Trp 

240 245 250 * 

CCT GGC CAA AAA CTG GCG CTG AAA CTA TTC TTT ACC ACT CAA CAA CCC 1116 
Arg Gly Clu Lys Val Ala Val Lys Val Pha Phe Thr Thr Clu Clu Ala 
255 260 265 
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AGC TOG TTT CCA GAA ACA GAA ATC TAC CAA ACT GTG C7A ATG CGC CAT 1164 
Ser Trp Phc Arg clu Thr Clu Xla Tyr Gin Thr Val Lau Kit Arg Bis 
270 275 280 285 

GAA AAC ATA CTT GOT TTC ATA GCC CCA GAC ATT AAA CCT ACA OCT TCC 1212 
Clu Am XI* Leu Cly Pha Xla Ala Ala Asp I la Lya Gly Thr Gly Ssr 
290 295 300 

TGG ACT CAG CTC TAT TTG ATT ACT CAT TAC CAT GAA AAT GCA TCT CTC 1260 
Trp Thr Gin Leu Tyr Leu I la Thr Asp Tyr His Glu Aan Gly sar Lau 

305 310 315 

TAT GAC TTC CTC AAA TCT CCT ACA CTC CAC ACC AGA GCC CTC CTT AAA 1308 
Tyr Asp Pha Lau Lya Cya Ala Thr Lau Aap Thr Arg Ala Lau Lau Lya 

320 325 330 

TTG GCT TAT TCA GCT GCC TCT GGT CTC TCC CAC CTC CAC ACA CAA ATT 1356 
Lau Ala Tyr Ser Ala Ala Cya Gly Lau Cya Bis Lau Hit Thr Clu 11a 
335 340 345 

TAT GGC ACC CAA GGA AAC CCC CCA ATT CCT CAT CCA CAC CTA AAC AGC 1404 
Tyr Cly Thr Gin Gly Lya Pro Ala Xla Ala His Arg Aap Lau Lya Sar 
350 355 360* 365 

AAA AAC ATC CTC ATC AAG AAA AAT COG ACT TCC TCC ATT GCT GAC CTC 1452 
Lys Aan Xla Leu Xla Lya Lya Aan Gly Sar Cya Cya Xla Ala Asp Lau 

370 375 380 

CCC CTT GCT GTT AAA TTC AAC AGT CAC ACA AAT GAA CTT CAT CTC CCC 1500 
Cly Leu Ala Val Lya Pha Aan Sar Asp Thr Aan Glu Val Asp Val Pro 
365 390 395 

TTG AAT ACC AGG GTC CGC ACC AAA CCC TAC ATC GCT CCC CAA CTG CTG 1548 
Leu Asn Thr Arg Val Cly Thr Lya Arg Tyr Mat Ala Pro Glu Val Lau 
400 405 410 

GAC CAA AGC CTC AAC AAA AAC CAC TTC CAG CCC TAC ATC ATC GCT CAC 1596 
Asp Clu Ser Leu Asn Lys Asn His Pha Gin Pro Tyr Xla Mat Ala Asp 
415 420 425 

ATC TAC AGC TTC GGC CTA ATC ATT TGG GAC ATC GCT CCT CGT TCT ATC 1644 
Xla Tyr Ser Pha Cly Leu Xla Xla Trp Clu Met Ala Arg Arg Cys Xla 
430 435 440 445 

ACA GCA GGG ATC CTC CAA GAA TAC CAA TTC CCA TAT TAC AAC ATC CTA 1692 
Thr Cly Cly Xla Val Clu Clu Tyr Cln Lau Pro Tyr Tyr Asn Mat Val 
450 455 460 

CCC AGT GAT CCG TCA TAC CAA GAT ATG CGT CAG GTT CTC TCT CTC AM 1740 
Pro ser Asp Pro sar Tyr Glu Asp Mat Arg Glu Val Val Cya Val Lys 
465 470 475 

CCT TTG CGC CCA ATT CTC TCT AAT CCC TCC AAC ACT GAT CAA TCT CTA 1788 
Arg Leu Arg Pro Xla Val Sar Asn Arg Trp Asn Sar Asp Clu Cys Lau 
480 485 490 

CCA CCA CTT TTC AAG CTA ATC TCA CAA TCC TGG CCC CAC AAT CCA GCC 1836 
Arg Ala Val Lau Lya Leu Met Ser Glu Cys Trp Ala His Asn Pro Ala 
495 500 505 
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TCC ACA CTC ACA OCA TTO ACA ATT AAO AXO ACQ CTT GCC AAO ATO CTT 1684 
Sar Arg Lau Thr Ala Lau Arg Ila Lya Lya Thr Lau Ala Lya Hat Val 

510 515 520 525 

GAA TCC CAA CAT OTA AAA ATC TCATOCTTAA ACCATCGCAG CACAAACTCT 1935 
Clu Sar Ola Aap Val Lya 21a 
530 

AGACTCCAAC AACTCTTTTT ACCCATCCCA TCGGTCCAAT TAGAGTGGAA TAAGGATGTT 1995 

AACTTCGTTC TCACACTCTT TCTTCACTAC CTCTTCACAG OCTOCTAATA TTAAACCTTT 2055 

CACTACTCTT ATTAGCATAC AAOCTCGCAA CTTCTAAACA CTTCATTCTT TATATATCGA 2115 

CACCTTTATT TTAAATGTCC TTTTTCATCC CTTTTTTTAA CTCCCTTTTT ATGAACTGCA 2175 

TCAAGACTTC AATCCTGATT ACTGTCTCCA CTCAAGCTCT GCGTACTGAA TT GCC T C TT C 2235 

ATAAAACGGT GCTTTCTGTG AAAGCCTTAA GAACATAAAT GAGCGCAGCA GACATGGAGA 2295 

AATAGACTTT GCCTTTTACC TGAGACATTC AGTTCCTTTG TATTCTACCT TTGTAAAACA 2355 

CCCTATAGAT GATGATGTGT TTGGGATACT CCTTATTTTA TGATAGTTTG TCCTGTGTCC 2415 

TTAGTCATCT CTCTGTCTCT CCATGCACAT CCACGCCGCC ATTCCTCTGC TGCCATTTCA 2475 

ATTAGAAGAA AATAATTTAT ATGCATGCAC AGGAAGATAT TGCTGGCCGG TGCTTTTCTG 2535 

CTTTAAAAAT CCAATATCTC ACCAAGATTC CCCAATCTCA TACAAGCCAT TTACTTTGCA 2595 

ACTGACATAG CTTCCCCACC AGCTTTATTT TTTAACATCA AAGCTGATGC CAAGGCCAAA 2655 

AGAAGTTTAA AGCATCTCTA AATTTCGACT GTTTTCCTTC AACCACCATT TTTTTTGTCG 2715 

TTATTATTTT TGTCACGGAA AGCATCCTCT CCAAAGTTGG AGCTTCTATT GCCATGAACC 2775 

ATGCTTACAA AGAAAGCACT TCTTATTGAA CTGAATTCCT GCATTTCATA GCAATCTAAC 2835 

TCCCTATAAC CATGTTCTAT ATTCTTTATT CTCACTAACT TTTAAAAGGG AAGTTATTTA 2895 

TATTTTGTCT ATAATGTGCT TTATTTGCAA ATCACCC 2932 

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 532 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

Met Thr Gln Leu Tyr Ile Tyr Ile Arg Leu Leu Gly Ala Tyr Leu Phe
1               5                   10                  15

Ile Ile Ser Arg Val Gln Gly Gln Asn Leu Asp Ser Met Leu His Gly
            20                  25                  30

Thr Gly Met Lys Ser Asp Ser Asp Gln Lys Lys Ser Glu Asn Gly Val
        35                  40                  45

Thr Leu Ala Pro Glu Asp Thr Leu Pro Phe Leu Lys Cys Tyr Cys Ser
    50                  55                  60

Gly His Cys Pro Asp Asp Ala Ile Asn Asn Thr Cys Ile Thr Asn Gly
65                  70                  75                  80

His Cys Phe Ala Ile Ile Glu Glu Asp Asp Gln Gly Glu Thr Thr Leu
                85                  90                  95

Ala Ser Gly Cys Met Lys Tyr Glu Gly Ser Asp Phe Gln Cys Lys Asp
            100                 105                 110

Ser Pro Lys Ala Gln Leu Arg Arg Thr Ile Glu Cys Cys Arg Thr Asn
        115                 120                 125

Leu Cys Asn Gln Tyr Leu Gln Pro Thr Leu Pro Pro Val Val Ile Gly
    130                 135                 140

Pro Phe Phe Asp Gly Ser Ile Arg Trp Leu Val Leu Leu Ile Ser Met
145                 150                 155                 160

Ala Val Cys Ile Ile Ala Met Ile Ile Phe Ser Ser Cys Phe Cys Tyr
                165                 170                 175

Lys His Tyr Cys Lys Ser Ile Ser Ser Arg Arg Arg Tyr Asn Arg Asp
            180                 185                 190

Leu Glu Gln Asp Glu Ala Phe Ile Pro Val Gly Glu Ser Leu Lys Asp
        195                 200                 205

Leu Ile Asp Gln Ser Gln Ser Ser Gly Ser Gly Ser Gly Leu Pro Leu
    210                 215                 220

Leu Val Gln Arg Thr Ile Ala Lys Gln Ile Gln Met Val Arg Gln Val
225                 230                 235                 240

Gly Lys Gly Arg Tyr Gly Glu Val Trp Met Gly Lys Trp Arg Gly Glu
                245                 250                 255

Lys Val Ala Val Lys Val Phe Phe Thr Thr Glu Glu Ala Ser Trp Phe
            260                 265                 270

Arg Glu Thr Glu Ile Tyr Gln Thr Val Leu Met Arg His Glu Asn Ile
        275                 280                 285

Leu Gly Phe Ile Ala Ala Asp Ile Lys Gly Thr Gly Ser Trp Thr Gln
    290                 295                 300

Leu Tyr Leu Ile Thr Asp Tyr His Glu Asn Gly Ser Leu Tyr Asp Phe
305                 310                 315                 320

Leu Lys Cys Ala Thr Leu Asp Thr Arg Ala Leu Leu Lys Leu Ala Tyr
                325                 330                 335

Ser Ala Ala Cys Gly Leu Cys His Leu His Thr Glu Ile Tyr Gly Thr
            340                 345                 350

Gln Gly Lys Pro Ala Ile Ala His Arg Asp Leu Lys Ser Lys Asn Ile
        355                 360                 365

Leu Ile Lys Lys Asn Gly Ser Cys Cys Ile Ala Asp Leu Gly Leu Ala
    370                 375                 380

Val Lys Phe Asn Ser Asp Thr Asn Glu Val Asp Val Pro Leu Asn Thr
385                 390                 395                 400

Arg Val Gly Thr Lys Arg Tyr Met Ala Pro Glu Val Leu Asp Glu Ser
                405                 410                 415

Leu Asn Lys Asn His Phe Gln Pro Tyr Ile Met Ala Asp Ile Tyr Ser
            420                 425                 430

Phe Gly Leu Ile Ile Trp Glu Met Ala Arg Arg Cys Ile Thr Gly Gly
        435                 440                 445

Ile Val Glu Glu Tyr Gln Leu Pro Tyr Tyr Asn Met Val Pro Ser Asp
    450                 455                 460

Pro Ser Tyr Glu Asp Met Arg Glu Val Val Cys Val Lys Arg Leu Arg
465                 470                 475                 480

Pro Ile Val Ser Asn Arg Trp Asn Ser Asp Glu Cys Leu Arg Ala Val
                485                 490                 495

Leu Lys Leu Met Ser Glu Cys Trp Ala His Asn Pro Ala Ser Arg Leu
            500                 505                 510

Thr Ala Leu Arg Ile Lys Lys Thr Leu Ala Lys Met Val Glu Ser Gln
        515                 520                 525

Asp Val Lys Ile
    530

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2333 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iii) ANTI-SENSE: NO

(v) FRAGMENT TYPE: internal

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Homo sapiens

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..1515






WO 94/11502 PCT/GB93/02367


(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

ATG CCC GAG TOG GCC GGA GCC TCC TCC TTC TTC CCC CTT CTT GTC CTC 48 
Met Al4 Glu Sar Alt Gly Ala Ser Ser Pha Pha Pro Lau Val Val Lau 
1 5 10 15 

CTC CTC GCC GGC AGC GGC GGG TCC CGG CCC CGG GGG GTC CAG GCT CTC 96 
Leu Lau Ala Gly Str Gly Gly Ser Gly Pro Arg Gly Val Cln Ala Leu 
20 25 30 

CTC TCT GCC TCC ACC AGC TCC CTC CAG GCC AAC TAC ACQ TCT GAG ACA 144 
Leu Cya Ala Cya Thr Sar Cya Leu Gin Ala Aan Tyr Thr Cya Glu Thr 
35 40 45 

CAT GGG GCC TGC ATG GTT TCC TTT TTC AAT CTG CAT CGG ATG GAG CAC 192 
Aap Gly Ala Cya Mat Val Sar Pha Pha Aan Leu Aap Gly Mat Glu Hia 
50 55 60 

CAT CTG CGC ACC TCC ATC CCC AAA CTG GAG CTG GTC CCT GCC GGG AAG 240 
Hia Val Arg Thr Cya Ila Pro Lya Val Glu Lau Val Pro Ala Gly Lya 
65 70 75 80 

CCC TTC TAC TGC CTG AGC TCG CAG CAC CTG CGC AAC ACC CAC TCC TGC 288 
Pro Pha Tyr Cya Leu Ser Ser Glu Aap Lau Arg Aan Thr Hia Cya Cya 
85 90 95 

TAC ACT CAC TAC TCC AAC AGG ATC CAC TTG AGG GTG CCC AGT CGT CAC 336 
Tyr Thr Aap Tyr Cya Aan Arg Ila Aap Lau Arg Val Pro Sar Gly Hia 
100 105 no 

CTC AAG GAG CCT GAG CAC CCG TCC ATG TGG GGC CCG GTG GAG CTG GTA 384 
Lou Lya Glu Pro Glu Hia Pro Sar Mat Trp Gly Pro Val Glu Lau Val 
115 120 125 

GGC ATC ATC CCC GGC CCG GTC TTC CTC CTG TTC CTC ATC ATC ATC ATT 432 
Gly Ila Ila Ala Gly Pro Val Pha Lau Lau Pha Lau Ila Ila Ila Ila 
130 135 140 

CTT TTC CTT CTC ATT AAC TAT CAT CAG CCT GTC TAT CAC AAC CGC CAG 480 
Val Pha Leu Val Ila Aan Tyr Hia Cln Arg Val Tyr Hia Aan Arg Gin 
"5 150 155 160 

AGA CTG GAC ATG GAA CAT CCC TCA TCT CAC ATC TGT CTC TCC AAA CAC 528 
Arg Leu Aap Met Glu Aap Pro Ser Cya Glu Mat Cya Lau Sar Lya Aap 
165 170 175 

AAG ACG CTC CAG CAT CTT CTC TAC CAT CTC TCC ACC TCA CGG TCT GGC S76 
Lya Thr Leu Cln Aap Leu Val Tyr Aap Leu Sar Thr Sar Gly Sar Gly 
180 185 190 

TCA GGG TTA CCC CTC TTT CTC CAG CCC ACA CTG GCC CCA ACC ATC GTT 624 
Ser Gly Leu Pro Lau Pha Val Cln Arg Thr Val Ala Arg Thr Ila Val 
195 200 205 

TTA CAA CAG ATT ATT GGC AAG GGT CGG TTT GGG GAA GTA TGG CGG GGC 672 
Leu Gln Glu Ile Ile Gly Lys Gly Arg Phe Gly Glu Val Trp Arg Gly 
210 215 220 
CCC TGG ACC GGT OCT CAT GTG OCT CTO AAA ATA TTC TCT TCT OCT CAA 720 
Arg Trp Arg Gly Gly Asp Val Ala Val Lys Ile Phe Ser Ser Arg Glu 
225 230 235 240 

CAA COG TCT TGG TTC AGO GAA OCA GAG ATA TAC GAG ACG GTC ATG CTG 768 
Glu Arg Ser Trp Phe Arg Glu Ala Glu Ile Tyr Gln Thr Val Met Leu 
245 250 255 

CCC CAT GAA AAC ATC CTT CGA TTT ATT OCT OCT GAC AAT AAA CAT AAT 816 
Arg His Glu Asn Ile Leu Gly Phe Ile Ala Ala Asp Asn Lys Asp Asn 
260 265 270 

GGC ACC TGG ACA GAG CTG TGG CTT CTT TCT GAC TAT CAT GAG CAC GGG 864 
Gly Thr Trp Thr Gln Leu Trp Leu Val Ser Asp Tyr His Glu His Gly 
275 280 285 

TCC CTG TTT CAT TAT CTG AAC COG TAC ACA CTG ACA ATT GAG GGG ATG 912 
Ser Leu Phe Asp Tyr Leu Asn Arg Tyr Thr Val Thr Ile Glu Gly Met 
290 295 300 

ATT AAG CTG GCC TTC TCT CCT CCT AGT GGG CTG GCA CAC CTG CAC ATG 960 
Ile Lys Leu Ala Leu Ser Ala Ala Ser Gly Leu Ala His Leu His Met 
305 310 315 320 

GAG ATC CTG GCC ACC CAA GGG AAG CCT OCA ATT OCT CAT CGA GAC TTA 1008 
Glu Ile Val Gly Thr Gln Gly Lys Pro Gly Ile Ala His Arg Asp Leu 
325 330 335 

AAG TCA AAG AAC ATT CTG GTG AAG AAA AAT GGC ATC TCT GCC ATA GCA 1056 
Lys Ser Lys Asn Ile Leu Val Lys Lys Asn Gly Met Cys Ala Ile Ala 
340 345 350 

GAC CTG GGC CTG CCT CTC CCT CAT CAT CCA CTC ACT CAC ACC ATT CAC 1104 
Asp Leu Gly Leu Ala Val Arg His Asp Ala Val Thr Asp Thr Ile Asp 
355 360 365 

ATT GCC CCC AAT CAG AGG CTG GGG ACC AAA CCA TAC ATG GCC CCT GAA 1152 
Ile Ala Pro Asn Gln Arg Val Gly Thr Lys Arg Tyr Met Ala Pro Glu 
370 375 380 

CTA CTT GAT GAA ACC ATT AAT ATG AAA CAC TTT GAC TCC TTT AAA TCT 1200 
Val Leu Asp Glu Thr Ile Asn Met Lys His Phe Asp Ser Phe Lys Cys 
385 390 395 400 

OCT GAT ATT TAT GCC CTC GGG CTT GTA TAT TGG GAG ATT CCT CGA ACA 1248 
Ala Asp Ile Tyr Ala Leu Gly Leu Val Tyr Trp Glu Ile Ala Arg Arg 
405 410 415 

TCC AAT TCT CGA CGA CTC CAT CAA CAA TAT CAC CTC CCA TAT TAC CAC 1296 
Cys Asn Ser Gly Gly Val His Glu Glu Tyr Gln Leu Pro Tyr Tyr Asp 
420 425 430 

TTA CTG CCC TCT GAC CCT TCC ATT GAG GAA ATC CCA AAG CTT CTA TCT 1344 
Leu Val Pro Ser Asp Pro Ser Ile Glu Glu Met Arg Lys Val Val Cys 
435 440 445 

GAT CAG AAG CTG CCT CCC AAC ATC CCC AAC TGG TGG CAG AGT TAT GAG 1392 
Asp Gln Lys Leu Arg Pro Asn Ile Pro Asn Trp Trp Gln Ser Tyr Glu 
450 455 460 




CCA CTG CGG CTG ATC GGC AAO ATC ATC CGA GAG TCT TGG TAT CCC AAC 1440 
Ala Leu Arg Val Met Gly Lys Met Met Arg Glu Cys Trp Tyr Ala Asn 
465 470 475 480 

GGC GCA GCC CGC CTC AOG GCC CTG CGC ATC AAG AAG ACC CTC TCC CAG 1488 
Gly Ala Ala Arg Leu Thr Ala Leu Arg Ile Lys Lys Thr Leu Ser Gln 
485 490 495 

CTC AGC CTC CAG CAA CAC CTC AAG ATC TAACTGCTCC CTCTCTCCAC 1535 
Leu Ser Val Gln Glu Asp Val Lys Ile 
500 505 

ACGGAGCTCC TGGCAGCCAG AACTACGCAC AGCTGCCGCG TTGAGCGTAC CATGGAGGCC 1595 

TACCTCTCGT TTCTCCCCAG CCCTCTCTCG CCAGGAGCCC TGGCCCGCAA GAGGGACAGA 1655 

CCCCGGGAGA GACTCCCTCA CTCCCATCTT CGGTTTCAGA CAGACACCTT TTCTATTTAC 1715 

CTCCTAATGG CATGGAGACT CTGACAGCCA ATTGTCTCGA CAACTCAGTG CCACACCTCG 1775 

AACTGGTTGT AGTGGGAAGT CCCGCGAAAC CCGGTGCATC TGGCACGTGG CCAGCAGCCA 1835 

TGACAGGGGC GCTTGGGAGG GGCCGGAGGA ACCGAGGTGT TCCCAGTGCT AAGCTGCCCT 1895 

GAGGGTTTCC TTCGGGGACC AGCCCACAGC ACACCAAGGT GGCCCGGAAG AACCAGAAGT 1955 

GCAGCCCCTC TCACAGGCAG CTCTGAGCCC CGCTTTCCCC TCCTCCCTGG GATGGACGC7 2015 

GCCGGGAGAC TCCCAGTCGA CACGGAATCT GCCGCTTTGT CTGTCCAGCC CTGTCTGCAT 2075 

GTGCCGAGGT GCGTCCCCCC TTCTGCCTGG TTCCTGCCAT GCCCTTACAC GTGCGTGTGA 2135 

CTCTGTGTGT GTCTCTCTAG GTGCGCACTT ACCTCCTTCA GCTTTCTCTC CATCTGCAGG 2195 

TCGGGGGTGT GGTCGTCATG CTGTCCCTGC TTGCTCGTGC CTCTTTTCAG TAGTGAGCAG 2255 

CATCTAGTTT CCCTGGTGCC CTTCCCTGGA CGTCTCTCCC TCCCCCAGAC CCCCTCATGC 2315 

CACAGTGGTA CTCTGTGT 2333 

(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 505 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

Met Ala Glu Ser Ala Gly Ala Ser Ser Phe Phe Pro Leu Val Val Leu
1 5 10 15

Leu Leu Ala Gly Ser Gly Gly Ser Gly Pro Arg Gly Val Gln Ala Leu
20 25 30

Leu Cys Ala Cys Thr Ser Cys Leu Gln Ala Asn Tyr Thr Cys Glu Thr
35 40 45

Asp Gly Ala Cys Met Val Ser Phe Phe Asn Leu Asp Gly Met Glu His
50 55 60

His Val Arg Thr Cys Ile Pro Lys Val Glu Leu Val Pro Ala Gly Lys
65 70 75 80

Pro Phe Tyr Cys Leu Ser Ser Glu Asp Leu Arg Asn Thr His Cys Cys
85 90 95

Tyr Thr Asp Tyr Cys Asn Arg Ile Asp Leu Arg Val Pro Ser Gly His
100 105 110

Leu Lys Glu Pro Glu His Pro Ser Met Trp Gly Pro Val Glu Leu Val
115 120 125

Gly Ile Ile Ala Gly Pro Val Phe Leu Leu Phe Leu Ile Ile Ile Ile
130 135 140

Val Phe Leu Val Ile Asn Tyr His Gln Arg Val Tyr His Asn Arg Gln
145 150 155 160

Arg Leu Asp Met Glu Asp Pro Ser Cys Glu Met Cys Leu Ser Lys Asp
165 170 175

Lys Thr Leu Gln Asp Leu Val Tyr Asp Leu Ser Thr Ser Gly Ser Gly
180 185 190

Ser Gly Leu Pro Leu Phe Val Gln Arg Thr Val Ala Arg Thr Ile Val
195 200 205

Leu Gln Glu Ile Ile Gly Lys Gly Arg Phe Gly Glu Val Trp Arg Gly
210 215 220

Arg Trp Arg Gly Gly Asp Val Ala Val Lys Ile Phe Ser Ser Arg Glu
225 230 235 240

Glu Arg Ser Trp Phe Arg Glu Ala Glu Ile Tyr Gln Thr Val Met Leu
245 250 255

Arg His Glu Asn Ile Leu Gly Phe Ile Ala Ala Asp Asn Lys Asp Asn
260 265 270

Gly Thr Trp Thr Gln Leu Trp Leu Val Ser Asp Tyr His Glu His Gly
275 280 285

Ser Leu Phe Asp Tyr Leu Asn Arg Tyr Thr Val Thr Ile Glu Gly Met
290 295 300

Ile Lys Leu Ala Leu Ser Ala Ala Ser Gly Leu Ala His Leu His Met
305 310 315 320

Glu Ile Val Gly Thr Gln Gly Lys Pro Gly Ile Ala His Arg Asp Leu
325 330 335

Lys Ser Lys Asn Ile Leu Val Lys Lys Asn Gly Met Cys Ala Ile Ala
340 345 350

Asp Leu Gly Leu Ala Val Arg His Asp Ala Val Thr Asp Thr Ile Asp
355 360 365

Ile Ala Pro Asn Gln Arg Val Gly Thr Lys Arg Tyr Met Ala Pro Glu
370 375 380

Val Leu Asp Glu Thr Ile Asn Met Lys His Phe Asp Ser Phe Lys Cys
385 390 395 400

Ala Asp Ile Tyr Ala Leu Gly Leu Val Tyr Trp Glu Ile Ala Arg Arg
405 410 415

Cys Asn Ser Gly Gly Val His Glu Glu Tyr Gln Leu Pro Tyr Tyr Asp
420 425 430

Leu Val Pro Ser Asp Pro Ser Ile Glu Glu Met Arg Lys Val Val Cys
435 440 445

Asp Gln Lys Leu Arg Pro Asn Ile Pro Asn Trp Trp Gln Ser Tyr Glu
450 455 460

Ala Leu Arg Val Met Gly Lys Met Met Arg Glu Cys Trp Tyr Ala Asn
465 470 475 480

Gly Ala Ala Arg Leu Thr Ala Leu Arg Ile Lys Lys Thr Leu Ser Gln
485 490 495

Leu Ser Val Gln Glu Asp Val Lys Ile
500 505



(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2308 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iii) ANTI-SENSE: NO

(v) FRAGMENT TYPE: internal

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Mouse

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 77..1585



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 
GGCGAGGCGA GGTTTCCTCO CCTGAGGCAG CGGCGCGGCC GGGCCCCGCC GGOCCAGAGG 60

CGGTCGCCCC CGGACC ATC CAG COG GOO CTC OCT OCT CCC OCT CCC CCC 109
Met Glu Ala Ala Val Ala Ala Pro Arg Pro Arg
1 5 10

CTG CTC CTC CTC CTC CTC CCC CCC CCC CCC GOG COG CCC GOG CCC CTC 157 
Leu Leu Leu Leu Val Leu Ala Ala Ala Ala Ala Ala Ala Ala Ala Leu 
15 20 25 

CTC CCC CGC CCC ACC CCC TTA CAG TCT TTC TGC CAC CTC TCT ACA AAA 205 
Leu Pro Gly Ala Thr Ala Leu Gln Cys Phe Cys His Leu Cys Thr Lys 
30 35 40 

GAC AAT TTT ACT TCT CTG ACA GAT GGG CTC TGC TTT CTC TCT CTC ACA 253 
Asp Asn Phe Thr Cys Val Thr Asp Gly Leu Cys Phe Val Ser Val Thr 
45 50 55 

CAG ACC ACA GAC AAA GTT ATA CAC AAC ACC ATG TCT ATA GCT CAA ATT 301 
Glu Thr Thr Asp Lys Val Ile His Asn Ser Met Cys Ile Ala Glu Ile 

60 65 70 75 

GAC TTA ATT CCT CGA GAT ACG CCG TTT GTA TCT GCA CCC TCT TCA AAA 349 
Asp Leu Ile Pro Arg Asp Arg Pro Phe Val Cys Ala Pro Ser Ser Lys 
80 85 90 

ACT GGG TCT CTG ACT ACA ACA TAT TGC TCC AAT CAG CAC CAT TGC AAT 397 
Thr Gly Ser Val Thr Thr Thr Tyr Cys Cys Asn Gln Asp His Cys Asn 
95 100 105 

AAA ATA GAA CTT CCA ACT ACT GTA AAG TCA TCA CCT CGC CTT GGT CCT 445 
Lys Ile Glu Leu Pro Thr Thr Val Lys Ser Ser Pro Gly Leu Gly Pro 
110 115 120 

CTG GAA CTG CCA GCT CTC ATT GCT GCA CCA GTG TGC TTC CTC TGC ATC 493 
Val Glu Leu Ala Ala Val Ile Ala Gly Pro Val Cys Phe Val Cys Ile 
125 130 135 

TCA CTC ATG TTG ATG CTC TAT ATC TCC CAC AAC CGC ACT CTC ATT CAC 541 
Ser Leu Met Leu Met Val Tyr Ile Cys His Asn Arg Thr Val Ile His 
140 145 150 155 

CAT CCA CTC CCA AAT CAA CAC CAC CCT TCA TTA CAT CCC CCT TTT ATT 589 
His Arg Val Pro Asn Glu Glu Asp Pro Ser Leu Asp Arg Pro Phe Ile 
160 165 170 

TCA CAC CCT ACT ACG TTC AAA CAC TTA ATT TAT CAT ATC ACA ACG TCA 637 
Ser Glu Gly Thr Thr Leu Lys Asp Leu Ile Tyr Asp Met Thr Thr Ser 
175 180 185 

GCT TCT CCC TCA GGT TTA CCA TTC CTT CTT CAC ACA ACA ATT CCG ACA 685 
Gly Ser Gly Ser Gly Leu Pro Leu Leu Val Gln Arg Thr Ile Ala Arg 
190 195 200 

ACT ATT CTC TTA CAA CAA ACC ATT CGC AAA GCT CCA TTT CGA GAA CTT 733 
Thr Ile Val Leu Gln Glu Ser Ile Gly Lys Gly Arg Phe Gly Glu Val 
205 210 215 

TCC ACA CCA AAC TGG CCG CCA CAA CAA CTT GCT GTT AAC ATA TTC TCC 781 
Trp Arg Gly Lys Trp Arg Gly Glu Glu Val Ala Val Lys Ile Phe Ser 
220 225 230 235 
TCT ACA CAA CAA CGT TCG TGG TTC CCT CAC CCA CAC ATT TAT CAA ACT 829 
Ser Arg Glu Glu Arg Ser Trp Phe Arg Glu Ala Glu Ile Tyr Gln Thr 
240 245 250 

CTA ATC TTA CGT CAT CAA AAC ATC CTC CCA TTT ATA CCA CCA CAC AAT 877 
Val Met Leu Arg His Glu Asn Ile Leu Gly Phe Ile Ala Ala Asp Asn 

255 260 265 

AAA CAC AAT CGT ACT TCC ACT CAC CTC TGC TTG CTC TCA CAT TAT CAT 925 
Lys Asp Asn Gly Thr Trp Thr Gln Leu Trp Leu Val Ser Asp Tyr His 
270 275 280 

GAG CAT CGA TCC CTT TTT CAT TAC TTA AAC AGA TAC ACA CTT ACT CTC 973 
Glu His Gly Ser Leu Phe Asp Tyr Leu Asn Arg Tyr Thr Val Thr Val 
285 290 295 

CAA CGA ATC ATA AAA CTT OCT CTC TCC ACC CCC ACC CGT CTT GCC CAT 1021 
Glu Gly Met Ile Lys Leu Ala Leu Ser Thr Ala Ser Gly Leu Ala His 
300 305 310 315 

CTT CAC ATC CAC ATT CTT CGT ACC CAA CCA AAC CCA CCC ATT CCT CAT 1069 
Leu His Met Glu Ile Val Gly Thr Gln Gly Lys Pro Ala Ile Ala His 
320 325 330 

AGA GAT TTG AAA TCA AAG AAT ATC TTG CTA AAG AAG AAT GGA ACT TGC 1117 
Arg Asp Leu Lys Ser Lys Asn Ile Leu Val Lys Lys Asn Gly Thr Cys 
335 340 345 

TCT ATT CCA CAC TTA CCA CTC CCA CTA ACA CAT CAT TCA CCC ACA CAT 1165 
Cys Ile Ala Asp Leu Gly Leu Ala Val Arg His Asp Ser Ala Thr Asp 
350 355 360 

ACC ATT GAT ATT CCT CCA AAC CAC AGA CTC CGA ACA AAA AGG TAC ATC 1213 
Thr Ile Asp Ile Ala Pro Asn His Arg Val Gly Thr Lys Arg Tyr Met 
365 370 375 

CCC CCT GAA CTT CTC CAT CAT TCC ATA AAT ATC AAA CAT TTT CAA TCC 1261 
Ala Pro Glu Val Leu Asp Asp Ser Ile Asn Met Lys His Phe Glu Ser 
380 385 390 395 

TTC AAA CGT GCT GAC ATC TAT CCA ATG CGC TTA CTA TTC TGC GAA ATT 1309 
Phe Lys Arg Ala Asp Ile Tyr Ala Met Gly Leu Val Phe Trp Glu Ile 
400 405 410 

CCT CGA CCA TCT TCC ATT GGT CGA ATT CAT CAA CAT TAC CAA CTC CCT 1357 
Ala Arg Arg Cys Ser Ile Gly Gly Ile His Glu Asp Tyr Gln Leu Pro 
415 420 425 

TAT TAT CAT CTT CTA CCT TCT CAC CCA TCA CTT CAA CAA ATC ACA AAA 1405 
Tyr Tyr Asp Leu Val Pro Ser Asp Pro Ser Val Glu Glu Met Arg Lys 
430 435 440 

CTT GTT TCT GAA CAC AAC TTA AGG CCA AAT ATC CCA AAC AGA TCC CAC 1453 
Val Val Cys Glu Gln Lys Leu Arg Pro Asn Ile Pro Asn Arg Trp Gln 
445 450 455 

ACC TGT CAA CCC TTG AGA CTA ATG GCT AAA ATT ATG AGA CAA TGT TCC 1501 
Ser Cys Glu Ala Leu Arg Val Met Ala Lys Ile Met Arg Glu Cys Trp 
460 465 470 475 
TAT OCC AAT CCA OCA OCT ACG CTT ACA OCA TTC CGC ATT AAG AAA ACA 1549 
Tyr Ala Asn Gly Ala Ala Arg Leu Thr Ala Leu Arg Ile Lys Lys Thr 
480 485 490 



TTA TCC CAA CTC ACT CAA CAC CAA CCC ATC AAA ATC TAATTCTACA 1595 
Leu Ser Gln Leu Ser Gln Gln Glu Gly Ile Lys Met 
495 500 



GCTTTGCCTG AACTCTCCTT TTTTCTTCAG ATCTGCTCCT GGGTTTTAAT TTCGGAGGTC 1655

AGTTGTTCTA CCTCACTGAG AGGGAACAGA AGGATATTGC TTCCTTTTGC AGCAGTGTAA 1715

TAAACTCAAT TAAAAACTTC CCAGCATTTC TTTCGACCCA GGAAAGAGCC ATCTGGGTCC 1775

TTTCTCTGCA CTATGAACGC TTCTTTCCCA GGACAGAAAA TGTCTAGTCT ACCTTTATTT 1835

TTTATTAACA AAACTTGTTT TTTAAAAAGA TGATTGCTGG TCTTAACTTT AGGTAACTCT 1895

CCTGTGCTCG ACATCATCTT TAAGGGCAAA GGAGTTGGAT TGCTCAATTA GAATGAAAGA 1955

TGTCTTATTA CTAAAGAAAC TGATTTACTC CTCGTTAGTA CATTCTWGA GGATTCTGAA 2015

CCACTAGAGT TTCCTTGATT CAGACTTTCA ATGTACTCTT CTATAGTTTT TCACCATCTT 2075

AAAACTAACA CTTATAAAAC TCTTATCTTC AGTCTAAAAA TGACCTCATA TAGTAGTGAG 2135

GAACATAATT CATGCAATTC TATTTTGTAT ACTATTATTG TTCTTTCACT TATTCAGAAC 2195

ATTACATGCC TTCAAAATGC GATTCTACTA TACCAGTAAG TGCCACTTCT GTGTCTTTCT 2255

AATGCAAATC AGTAGAATTC CTGAAAGTCT CTATGTTAAA ACCTATACTG TTT 2308



(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 503 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

Met Glu Ala Ala Val Ala Ala Pro Arg Pro Arg Leu Leu Leu Leu Val
1 5 10 15

Leu Ala Ala Ala Ala Ala Ala Ala Ala Ala Leu Leu Pro Gly Ala Thr
20 25 30

Ala Leu Gln Cys Phe Cys His Leu Cys Thr Lys Asp Asn Phe Thr Cys
35 40 45

Val Thr Asp Gly Leu Cys Phe Val Ser Val Thr Glu Thr Thr Asp Lys
50 55 60

Val Ile His Asn Ser Met Cys Ile Ala Glu Ile Asp Leu Ile Pro Arg
65 70 75 80

Asp Arg Pro Phe Val Cys Ala Pro Ser Ser Lys Thr Gly Ser Val Thr
85 90 95

Thr Thr Tyr Cys Cys Asn Gln Asp His Cys Asn Lys Ile Glu Leu Pro
100 105 110

Thr Thr Val Lys Ser Ser Pro Gly Leu Gly Pro Val Glu Leu Ala Ala
115 120 125

Val Ile Ala Gly Pro Val Cys Phe Val Cys Ile Ser Leu Met Leu Met
130 135 140

Val Tyr Ile Cys His Asn Arg Thr Val Ile His His Arg Val Pro Asn
145 150 155 160

Glu Glu Asp Pro Ser Leu Asp Arg Pro Phe Ile Ser Glu Gly Thr Thr
165 170 175

Leu Lys Asp Leu Ile Tyr Asp Met Thr Thr Ser Gly Ser Gly Ser Gly
180 185 190

Leu Pro Leu Leu Val Gln Arg Thr Ile Ala Arg Thr Ile Val Leu Gln
195 200 205

Glu Ser Ile Gly Lys Gly Arg Phe Gly Glu Val Trp Arg Gly Lys Trp
210 215 220

Arg Gly Glu Glu Val Ala Val Lys Ile Phe Ser Ser Arg Glu Glu Arg
225 230 235 240

Ser Trp Phe Arg Glu Ala Glu Ile Tyr Gln Thr Val Met Leu Arg His
245 250 255

Glu Asn Ile Leu Gly Phe Ile Ala Ala Asp Asn Lys Asp Asn Gly Thr
260 265 270

Trp Thr Gln Leu Trp Leu Val Ser Asp Tyr His Glu His Gly Ser Leu
275 280 285

Phe Asp Tyr Leu Asn Arg Tyr Thr Val Thr Val Glu Gly Met Ile Lys
290 295 300

Leu Ala Leu Ser Thr Ala Ser Gly Leu Ala His Leu His Met Glu Ile
305 310 315 320

Val Gly Thr Gln Gly Lys Pro Ala Ile Ala His Arg Asp Leu Lys Ser
325 330 335

Lys Asn Ile Leu Val Lys Lys Asn Gly Thr Cys Cys Ile Ala Asp Leu
340 345 350

Gly Leu Ala Val Arg His Asp Ser Ala Thr Asp Thr Ile Asp Ile Ala
355 360 365

Pro Asn His Arg Val Gly Thr Lys Arg Tyr Met Ala Pro Glu Val Leu
370 375 380

Asp Asp Ser Ile Asn Met Lys His Phe Glu Ser Phe Lys Arg Ala Asp
385 390 395 400

Ile Tyr Ala Met Gly Leu Val Phe Trp Glu Ile Ala Arg Arg Cys Ser
405 410 415

Ile Gly Gly Ile His Glu Asp Tyr Gln Leu Pro Tyr Tyr Asp Leu Val
420 425 430

Pro Ser Asp Pro Ser Val Glu Glu Met Arg Lys Val Val Cys Glu Gln
435 440 445

Lys Leu Arg Pro Asn Ile Pro Asn Arg Trp Gln Ser Cys Glu Ala Leu
450 455 460

Arg Val Met Ala Lys Ile Met Arg Glu Cys Trp Tyr Ala Asn Gly Ala
465 470 475 480

Ala Arg Leu Thr Ala Leu Arg Ile Lys Lys Thr Leu Ser Gln Leu Ser
485 490 495

Gln Gln Glu Gly Ile Lys Met
500

(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1922 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iii) ANTI-SENSE: NO

(v) FRAGMENT TYPE: internal

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Mouse

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 241..1746

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

GAGAGCACAG CCCTTCCCAG TCCCCGGAGC CGCCCCCCCA CGCCCCCATC ATCAAGACCT 60 

TTTCCCCGGC CCCACACGCC CTCTGCACCT CAGACCCCCC CCCCCTCCGC AAGCACAGGC 120 

GGGGGTCGAG TCGCCCTGTC CAAAGGCCTC AATCTAAACA ATCTTGATTC CTGTTCCCCC 180 

CTGGCGGGAC CCTCAATGGC AGGAAATCTC ACCACATCTC TTCTCCTATC TCCAAGGACC 240 

ATG ACC TTG GGG AGC TTC AGA AGG GGC CTT TTC ATG CTC TCC CTC CCC 288 
Met Thr Leu Gly Ser Phe Arg Arg Gly Leu Leu Met Leu Ser Val Ala 
1 5 10 15 
TTC GCC CTA ACC CAG CGG ACA CTT GOG AAO CCT TCC AAG CTG CTG AAC 336
Leu Gly Leu Thr Gln Gly Arg Leu Ala Lys Pro Ser Lys Leu Val Asn
20 25 30

TGC ACT TCT GAG AGO CCA CAC TCC AAG AGA CCA TTC TCC CAG CGG TCA 384
Cys Thr Cys Glu Ser Pro His Cys Lys Arg Pro Phe Cys Gln Gly Ser
35 40 45

TGG TGC ACA GTG CTG CTG CTT CCA CAG CAG CCC AGG CAC CCC CAG CTC 432
Trp Cys Thr Val Val Leu Val Arg Glu Gln Gly Arg His Pro Gln Val
50 55 60

TAT CGG GGC TCT CGG AGC CTC AAC CAG CAG CTC TGC TTG CCA CGT CCC 480
Tyr Arg Gly Cys Gly Ser Leu Asn Gln Glu Leu Cys Leu Gly Arg Pro
65 70 75 80

ACG CAG TTT CTG AAC CAT CAC TCC TGC TAT AGA TCC TTC TGC AAC CAC 528
Thr Glu Phe Leu Asn His His Cys Cys Tyr Arg Ser Phe Cys Asn His
85 90 95

AAC CTG TCT CTG ATC CTG CAG GCC ACC CAA ACT CCT TCG CAC CAG CCA 576
Asn Val Ser Leu Met Leu Glu Ala Thr Gln Thr Pro Ser Glu Glu Pro
100 105 110

CAA CTT CAT CCC CAT CTC CCT CTC ATC CTC CGT CCT CTG CTG CCC TTC 624
Glu Val Asp Ala His Leu Pro Leu Ile Leu Gly Pro Val Leu Ala Leu
115 120 125


CCG CTC CTC CTG GCC CTG CCT CCT CTC GGC TTG TGG CGT CTC CGG CGC 672 
Pro Val Leu Val Ala Leu Gly Ala Leu Gly Leu Trp Arg Val Arg Arg 
130 135 140 

AGG CAC CAG AAG CAG CGG CAT TTC CAC ACT CAC CTG GGC GAG TCC AGT 720 
Arg Gln Glu Lys Gln Arg Asp Leu His Ser Asp Leu Gly Glu Ser Ser 
145 150 155 160 

CTC ATC CTG AAG CCA TCT CAA CAG CCA CAC AGC ATG TTG CGC CAC TTC 768 
Leu Ile Leu Lys Ala Ser Glu Gln Ala Asp Ser Met Leu Gly Asp Phe 
165 170 175 

CTC GAC AGC CAC TCT ACC ACC CGC ACC CGC TCG GGC CTC CCC TTC TTG 816 
Leu Asp Ser Asp Cys Thr Thr Gly Ser Gly Ser Gly Leu Pro Phe Leu 
180 185 190 

CTC CAG AGC ACC CTA CCT CCC CAC CTT CCG CTG CTA CAG TCT CTC CCA 864 
Val Gln Arg Thr Val Ala Arg Gln Val Ala Leu Val Glu Cys Val Gly 
195 200 205 

AAG GGC CGA TAT GGC GAG CTG TGG CCC CCT TCC TCC CAT CGC CAA AGC 912 
Lys Gly Arg Tyr Gly Glu Val Trp Arg Gly Ser Trp His Gly Glu Ser 
210 215 220 

CTG CCG CTC AAC ATT TTC TCC TCA CCA CAT CAG CAG TCC TCC TTC CCC 960 
Val Ala Val Lys Ile Phe Ser Ser Arg Asp Glu Gln Ser Trp Phe Arg 
225 230 235 240 

GAG ACG GAG ATC TAC AAC ACA CTT CTG CTT ACA CAC GAC AAC ATC CTA 1008 

Glu Thr Glu Ile Tyr Asn Thr Val Leu Leu Arg His Asp Asn Ile Leu 
245 250 255 
CGC TTC ATC CCC TCC CAC ATG ACT TCC CGG AAC TOO ACC ACS CAO CTC 1056 
Gly Phe Ile Ala Ser Asp Met Thr Ser Arg Asn Ser Ser Thr Gln Leu 
260 265 270 

TOG CTC ATC ACC CAC TAC CAT CAA CAC CGC TCC CTC TAT CAC TTT CTG 1104 
Trp Leu Ile Thr His Tyr His Glu His Gly Ser Leu Tyr Asp Phe Leu 
275 280 285 

CAG AGG CAG ACG CTC GAG CCC GAG TTG GCC CTG AGG CTA GCT GTG TCC 1152 

Gln Arg Gln Thr Leu Glu Pro Gln Leu Ala Leu Arg Leu Ala Val Ser 

290 295 300 

CCG GCC TCC GGC CTG GCG CAC CTA CAT GTG GAG ATC TTT GGC ACT CAA 1200 
Pro Ala Cys Gly Leu Ala His Leu His Val Glu Ile Phe Gly Thr Gln 
305 310 315 320 

GGC AAA CCA GCC ATT CCC CAT CGT GAC CTC AAG AGT CGC AAT CTG CTG 1248 
Gly Lys Pro Ala Ile Ala His Arg Asp Leu Lys Ser Arg Asn Val Leu 
325 330 335 

GTC AAG ACT AAC TTG CAG TCT TCC ATT CCA GAC CTG CGA CTG GCT CTC 1296 
Val Lys Ser Asn Leu Gln Cys Cys Ile Ala Asp Leu Gly Leu Ala Val 
340 345 350 

ATG CAC TCA CAA AGC AAC CAG TAC CTC GAT ATC GGC AAC ACA CCC CGA 1344 
Met His Ser Gln Ser Asn Glu Tyr Leu Asp Ile Gly Asn Thr Pro Arg 
355 360 365 

GTG CGT ACC AAA ACA TAC ATG CCA CCC CAG GTG CTG CAT GAG CAC ATC 1392 
Val Gly Thr Lys Arg Tyr Met Ala Pro Glu Val Leu Asp Glu His Ile 
370 375 380 

CGC ACA CAC TCC TTT CAC TCC TAC AAC TGG ACA GAC ATC TGG CCC TTT 1440 
Arg Thr Asp Cys Phe Glu Ser Tyr Lys Trp Thr Asp Ile Trp Ala Phe 
385 390 395 400 

CGC CTA CTC CTA TGG GAG ATC CCC CCC CCG ACC ATC ATC AAT CGC ATT 1488 
Gly Leu Val Leu Trp Glu Ile Ala Arg Arg Thr Ile Ile Asn Gly Ile 
405 410 415 

GTG GAG GAT TAC AGG CCA CCT TTC TAT GAC ATG CTA CCC AAT CAC CCC 1536 
Val Glu Asp Tyr Arg Pro Pro Phe Tyr Asp Met Val Pro Asn Asp Pro 
420 425 430 

AGT TTT GAG CAC ATG AAA AAG CTG CTC TGC CTT CAC CAG CAC ACA CCC 1584 
Ser Phe Glu Asp Met Lys Lys Val Val Cys Val Asp Gln Gln Thr Pro 
435 440 445 

ACC ATC CCT AAC CGG CTC CCT CCA CAT CCC CTC CTC TCC CCC CTG GCC 1632 
Thr Ile Pro Asn Arg Leu Ala Ala Asp Pro Val Leu Ser Gly Leu Ala 
450 455 460 

CAG ATG ATG AGA CAG TCC TGG TAC CCC AAC CCC TCT CCT CCC CTC ACC 1680 
Gln Met Met Arg Glu Cys Trp Tyr Pro Asn Pro Ser Ala Arg Leu Thr 
465 470 475 480 

CCA CTG CCC ATA AAG AAG ACA TTC CAC AAG CTC AGT CAC AAT CCA GAG 1728 
Ala Leu Arg Ile Lys Lys Thr Leu Gln Lys Leu Ser His Asn Pro Glu 
485 490 495 
AAG CCC AAA CTO ATT CAC TAGCCCAGGG CCACCACCCT TCCTCTGCCT 1776 
Lys Pro Lys Val Ile His 
500 

AAAGTGTCTC CTGGGGAMA ACACATACCC TGTCTGGGTA CAGGGAGTGA AGACACTCTG 1836 

CACCCTGCCC TCTGTGTCCC TOCTCAGCTT GCTCCCAGCC CATCCAGCCA AAAATACAGC 1896 

TCAGCTGAAA TTCAAAAAAA AAAAAA 1922 

(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 502 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

Met Thr Leu Gly Ser Phe Arg Arg Gly Leu Leu Met Leu Ser Val Ala
1 5 10 15

Leu Gly Leu Thr Gln Gly Arg Leu Ala Lys Pro Ser Lys Leu Val Asn
20 25 30

Cys Thr Cys Glu Ser Pro His Cys Lys Arg Pro Phe Cys Gln Gly Ser
35 40 45

Trp Cys Thr Val Val Leu Val Arg Glu Gln Gly Arg His Pro Gln Val
50 55 60

Tyr Arg Gly Cys Gly Ser Leu Asn Gln Glu Leu Cys Leu Gly Arg Pro
65 70 75 80

Thr Glu Phe Leu Asn His His Cys Cys Tyr Arg Ser Phe Cys Asn His
85 90 95

Asn Val Ser Leu Met Leu Glu Ala Thr Gln Thr Pro Ser Glu Glu Pro
100 105 110

Glu Val Asp Ala His Leu Pro Leu Ile Leu Gly Pro Val Leu Ala Leu
115 120 125

Pro Val Leu Val Ala Leu Gly Ala Leu Gly Leu Trp Arg Val Arg Arg
130 135 140

Arg Gln Glu Lys Gln Arg Asp Leu His Ser Asp Leu Gly Glu Ser Ser
145 150 155 160

Leu Ile Leu Lys Ala Ser Glu Gln Ala Asp Ser Met Leu Gly Asp Phe
165 170 175

Leu Asp Ser Asp Cys Thr Thr Gly Ser Gly Ser Gly Leu Pro Phe Leu
180 185 190

Val Gln Arg Thr Val Ala Arg Gln Val Ala Leu Val Glu Cys Val Gly
195 200 205

Lys Gly Arg Tyr Gly Glu Val Trp Arg Gly Ser Trp His Gly Glu Ser
210 215 220

Val Ala Val Lys Ile Phe Ser Ser Arg Asp Glu Gln Ser Trp Phe Arg
225 230 235 240

Glu Thr Glu Ile Tyr Asn Thr Val Leu Leu Arg His Asp Asn Ile Leu
245 250 255

Gly Phe Ile Ala Ser Asp Met Thr Ser Arg Asn Ser Ser Thr Gln Leu
260 265 270

Trp Leu Ile Thr His Tyr His Glu His Gly Ser Leu Tyr Asp Phe Leu
275 280 285

Gln Arg Gln Thr Leu Glu Pro Gln Leu Ala Leu Arg Leu Ala Val Ser
290 295 300

Pro Ala Cys Gly Leu Ala His Leu His Val Glu Ile Phe Gly Thr Gln
305 310 315 320

Gly Lys Pro Ala Ile Ala His Arg Asp Leu Lys Ser Arg Asn Val Leu
325 330 335

Val Lys Ser Asn Leu Gln Cys Cys Ile Ala Asp Leu Gly Leu Ala Val
340 345 350

Met His Ser Gln Ser Asn Glu Tyr Leu Asp Ile Gly Asn Thr Pro Arg
355 360 365

Val Gly Thr Lys Arg Tyr Met Ala Pro Glu Val Leu Asp Glu His Ile
370 375 380

Arg Thr Asp Cys Phe Glu Ser Tyr Lys Trp Thr Asp Ile Trp Ala Phe
385 390 395 400

Gly Leu Val Leu Trp Glu Ile Ala Arg Arg Thr Ile Ile Asn Gly Ile
405 410 415

Val Glu Asp Tyr Arg Pro Pro Phe Tyr Asp Met Val Pro Asn Asp Pro
420 425 430

Ser Phe Glu Asp Met Lys Lys Val Val Cys Val Asp Gln Gln Thr Pro
435 440 445

Thr Ile Pro Asn Arg Leu Ala Ala Asp Pro Val Leu Ser Gly Leu Ala
450 455 460

Gln Met Met Arg Glu Cys Trp Tyr Pro Asn Pro Ser Ala Arg Leu Thr
465 470 475 480

Ala Leu Arg Ile Lys Lys Thr Leu Gln Lys Leu Ser His Asn Pro Glu
485 490 495

Lys Pro Lys Val Ile His
500
(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2070 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iii) ANTI-SENSE: NO

(v) FRAGMENT TYPE: internal

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Mouse

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 217..1812

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 



ATTCATCAGA TGGAAGCATA GGTCAAAGCT GTTCGCAGAA ATTGCAACTA CAGTTTTATC 60

TACCCACATC TCTGAGAATT CTGAAGAAAG CAGCAGCTGA AACTCATTCC CAAGTCATTT 120

TGTTCTGTAA GGAAGCCTCC CTCATTCACT TACACCACTG AGACACCAGG ACCAGTCATT 180

CAAAGGGCCG TGTACAGGAC GCGTGCCAAT CAGACA ATG ACT CAC CTA TAC ACT 234
Met Thr Gln Leu Tyr Thr
1 5



TAC ATC AGA TTA CTG CCA CCC TGT CTG TTC ATC ATT TCT CAT GTT CAA 282 
Tyr Ile Arg Leu Leu Gly Ala Cys Leu Phe Ile Ile Ser His Val Gln 

10 15 20 

GGG CAG AAT CTA GAT AGT ATG CTC CAT GCC ACT CGT ATG AAA TCA CAC 330 
Gly Gln Asn Leu Asp Ser Met Leu His Gly Thr Gly Met Lys Ser Asp 
25 30 35 

TTG GAC CAG AAC AAG CCA GAA AAT CCA CTG ACT TTA CCA CCA CAG CAT 378 
Leu Asp Gln Lys Lys Pro Glu Asn Gly Val Thr Leu Ala Pro Glu Asp 
40 45 50 

ACC TTG CCT TTC TTA AAC TGC TAT TCC TCA GGA CAC TGC CCA CAT CAT 426 
Thr Leu Pro Phe Leu Lys Cys Tyr Cys Ser Gly His Cys Pro Asp Asp 
55 60 65 70 

CCT ATT AAT AAC ACA TGC ATA ACT AAT CCC CAT TCC TTT CCC ATT ATA 474 
Ala Ile Asn Asn Thr Cys Ile Thr Asn Gly His Cys Phe Ala Ile Ile 
75 80 85 

CAA GAA GAT CAT CAG CCA CAA ACC ACA TTA ACT TCT GGG TCT ATG AAG 522 
Glu Glu Asp Asp Gln Gly Glu Thr Thr Leu Thr Ser Gly Cys Met Lys 
90 95 100 
TXT CAA CCC TCT CXT TTT CAA TOC AAC CAT TCA CCC AAA CCC CAO CTA 570 
Tyr Glu Gly Ser Asp Phe Gln Cys Lys Asp Ser Pro Lys Ala Gln Leu 

105 110 115 

CCC AGO ACA ATA CAA TCT TCT CCG ACC AAT TTC TCC AAC CAC TAT TTG 618 
Arg Arg Thr Ile Glu Cys Cys Arg Thr Asn Leu Cys Asn Gln Tyr Leu 
120 125 130 

CAG CCT ACA CTG CCC CCT CTT CTT ATA COT CCC TTC TTT CAT CCC ACC 666 
Gln Pro Thr Leu Pro Pro Val Val Ile Gly Pro Phe Phe Asp Gly Ser 
135 140 145 150 

ATC CGA TGG CTG CTT CTG CTC ATT TCC ATG CCT CTC TGT ATA CTT CCT 714 
Ile Arg Trp Leu Val Val Leu Ile Ser Met Ala Val Cys Ile Val Ala 
155 160 165 

ATG ATC ATC TTC TCC ACC TGC TTT TCC TAT AAC CAT TAT TCT AAC AGT 762 
Met Ile Ile Phe Ser Ser Cys Phe Cys Tyr Lys His Tyr Cys Lys Ser 

170 175 180 

ATC TCA AGC AGG GGT CGT TAC AAC CCT CAT TTG CAA CAG CAT CAA CCA 810 
Ile Ser Ser Arg Gly Arg Tyr Asn Arg Asp Leu Glu Gln Asp Glu Ala 
185 190 195 

TTT ATT CCA CTA CGA CAA TCA TTG AAA GAC CTG ATT GAC CAG TCC CAA 858 
Phe Ile Pro Val Gly Glu Ser Leu Lys Asp Leu Ile Asp Gln Ser Gln 
200 205 210 

AGC TCT GGG AGT GCA TCT CGA TTG CCT TTA TTG CTT CAG CGA ACT ATT 906 
Ser Ser Gly Ser Gly Ser Gly Leu Pro Leu Leu Val Gln Arg Thr Ile 
215 220 225 230 

GCC AAA CAG ATT CAG ATG CTT CGC CAG CTT CGT AAA CGC CGC TAT GGA 954 
Ala Lys Gln Ile Gln Met Val Arg Gln Val Gly Lys Gly Arg Tyr Gly 
235 240 245 

CAA GTA TGG ATC GGT AAA TGG CGT GGT CAA AAA GTG CCT CTC AAA CTC 1002 
Glu Val Trp Met Gly Lys Trp Arg Gly Glu Lys Val Ala Val Lys Val 
250 255 260 

TTT TTT ACC ACT CAA GAA CCT AGC TGG TTT ACA CAA ACA CAA ATC TAC 1050 
Phe Phe Thr Thr Glu Glu Ala Ser Trp Phe Arg Glu Thr Glu Ile Tyr 
265 270 275 

CAG ACG CTG TTA ATG CCT CAT CAA AAT ATA CTT CGT TTT ATA CCT CCA 1098 
Gln Thr Val Leu Met Arg His Glu Asn Ile Leu Gly Phe Ile Ala Ala 
280 285 290 

GAC ATT AAA GGC ACT GGT TCC TGG ACT CAG CTG TAT TTC ATT ACT CAT 1146 
Asp Ile Lys Gly Thr Gly Ser Trp Thr Gln Leu Tyr Leu Ile Thr Asp 
295 300 305 310 

TAC CAT GAA AAT CGA TCT CTC TAT GAC TTC CTG AAA TCT CCC ACA CTA 1194 
Tyr His Glu Asn Gly Ser Leu Tyr Asp Phe Leu Lys Cys Ala Thr Leu 
315 320 325 

GAC ACC AGA GCC CTA CTC AAG TTA CCT TAT TCT CCT CCT TGT GGT CTC 1242 
Asp Thr Arg Ala Leu Leu Lys Leu Ala Tyr Ser Ala Ala Cys Gly Leu 
330 335 340 
TCC CAC CTC CXC ACA CAA ATT TAT CCT ACC CAA GGG AAC CCT CCA ATT 1290 
Cys His Leu His Thr Glu Ile Tyr Gly Thr Gln Gly Lys Pro Ala Ile 
345 350 355 

OCT CAT CCA CAC CTC AAC ACC AAA AAC ATC CTT ATT AAG AAA AAT CGA 1338 
Ala His Arg Asp Leu Lys Ser Lys Asn Ile Leu Ile Lys Lys Asn Gly 

360 365 370 

AGT TCC TGT ATT CCT CAC CTC CCC CTA CCT CTT AAA TTC AAC ACT CAT 1386 
Ser Cys Cys Ile Ala Asp Leu Gly Leu Ala Val Lys Phe Asn Ser Asp 

375 380 385 390 

ACA AAT CAA CTT CAC ATA CCC TTC AAT ACC ACG CTC CCC ACC AAG CGC 1434 

Thr Asn Glu Val Asp Ile Pro Leu Asn Thr Arg Val Gly Thr Lys Arg 
395 400 405 

TAC ATC CCT CCA CAA CTC CTC CAT CAA ACC CTC AAT AAA AAC CAT TTC 1482 
Tyr Met Ala Pro Glu Val Leu Asp Glu Ser Leu Asn Lys Asn His Phe 
410 415 420 

CAC CCC TAC ATC ATG CCT CAC ATC TAT ACC TTT CCT TTC ATC ATT TCC 1530 
Gln Pro Tyr Ile Met Ala Asp Ile Tyr Ser Phe Gly Leu Ile Ile Trp 
425 430 435 

CAA ATG CCT CCT CCT TGT ATT ACA CCA CCA ATC CTC CAC CAA TAT CAA 1578 
Glu Met Ala Arg Arg Cys Ile Thr Gly Gly Ile Val Glu Glu Tyr Gln 
440 445 450 

TTA CCA TAT TAC AAC ATC CTG CCC ACT CAC CCA TCC TAT GAG GAC ATG 1626 
Leu Pro Tyr Tyr Asn Met Val Pro Ser Asp Pro Ser Tyr Glu Asp Met 
455 460 465 470 

CCT CAC CTT CTG TGT CTG AAA CCC TTG CGC CCA ATC CTG TCT AAC CGC 1674 
Arg Glu Val Val Cys Val Lys Arg Leu Arg Pro Ile Val Ser Asn Arg 
475 480 485 

TCC AAC AGC CAT CAA TGT CTT CCA CCA CTT TTC AAG CTA ATG TCA GAA 1722 
Trp Asn Ser Asp Glu Cys Leu Arg Ala Val Leu Lys Leu Met Ser Glu 
490 495 500 

TGT TCC CCC CAT AAT CCA CCC TCC ACA CTC ACA CCT TTC ACA ATC AAC 1770 
Cys Trp Ala His Asn Pro Ala Ser Arg Leu Thr Ala Leu Arg Ile Lys 
505 510 515 

AAG ACA CTT CCA AAA ATG CTT CAA TCC CAC CAT CTA AAG ATT 1812 

Lys Thr Leu Ala Lys Met Val Glu Ser Gln Asp Val Lys Ile 

520 525 530 

TGACAATTAA ACAATTTTCA CCCAGAATTT AGACTCCAAC AACTTCTTCA CCCAAGCAAT 1872 

CCCTCCCATT AGCATGCAAT ACCATGTTCA CTTCCTTTCC ACACTCCTTC CTCTACATCT 1932 

TCACAGCCTG CTAACAGTAA ACCTTACCCT ACTCTACAGA ATACAAGATT GGAACTTGGA 1992 

ACTTCAAACA TGTCATTCTT TATATATGAC AC C I TT G T T T TAATGTGCCG TTTTTTTGTT 2052 



TGCTTTTTTT CTTTTCTT 2070
(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 532 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

Met Thr Gln Leu Tyr Thr Tyr Ile Arg Leu Leu Gly Ala Cys Leu Phe
1 5 10 15

Ile Ile Ser His Val Gln Gly Gln Asn Leu Asp Ser Met Leu His Gly
20 25 30

Thr Gly Met Lys Ser Asp Leu Asp Gln Lys Lys Pro Glu Asn Gly Val
35 40 45

Thr Leu Ala Pro Glu Asp Thr Leu Pro Phe Leu Lys Cys Tyr Cys Ser
50 55 60

Gly His Cys Pro Asp Asp Ala Ile Asn Asn Thr Cys Ile Thr Asn Gly
65 70 75 80

His Cys Phe Ala Ile Ile Glu Glu Asp Asp Gln Gly Glu Thr Thr Leu
85 90 95

Thr Ser Gly Cys Met Lys Tyr Glu Gly Ser Asp Phe Gln Cys Lys Asp
100 105 110

Ser Pro Lys Ala Gln Leu Arg Arg Thr Ile Glu Cys Cys Arg Thr Asn
115 120 125

Leu Cys Asn Gln Tyr Leu Gln Pro Thr Leu Pro Pro Val Val Ile Gly
130 135 140

Pro Phe Phe Asp Gly Ser Ile Arg Trp Leu Val Val Leu Ile Ser Met
145 150 155 160

Ala Val Cys Ile Val Ala Met Ile Ile Phe Ser Ser Cys Phe Cys Tyr
165 170 175

Lys His Tyr Cys Lys Ser Ile Ser Ser Arg Gly Arg Tyr Asn Arg Asp
180 185 190

Leu Glu Gln Asp Glu Ala Phe Ile Pro Val Gly Glu Ser Leu Lys Asp
195 200 205

Leu Ile Asp Gln Ser Gln Ser Ser Gly Ser Gly Ser Gly Leu Pro Leu
210 215 220

Leu Val Gln Arg Thr Ile Ala Lys Gln Ile Gln Met Val Arg Gln Val
225 230 235 240

Gly Lys Gly Arg Tyr Gly Glu Val Trp Met Gly Lys Trp Arg Gly Glu
245 250 255

Lys Val Ala Val Lys Val Phe Phe Thr Thr Glu Glu Ala Ser Trp Phe
260 265 270

Arg Glu Thr Glu Ile Tyr Gln Thr Val Leu Met Arg His Glu Asn Ile
275 280 285

Leu Gly Phe Ile Ala Ala Asp Ile Lys Gly Thr Gly Ser Trp Thr Gln
290 295 300

Leu Tyr Leu Ile Thr Asp Tyr His Glu Asn Gly Ser Leu Tyr Asp Phe
305 310 315 320

Leu Lys Cys Ala Thr Leu Asp Thr Arg Ala Leu Leu Lys Leu Ala Tyr
325 330 335

Ser Ala Ala Cys Gly Leu Cys His Leu His Thr Glu Ile Tyr Gly Thr
340 345 350

Gln Gly Lys Pro Ala Ile Ala His Arg Asp Leu Lys Ser Lys Asn Ile
355 360 365

Leu Ile Lys Lys Asn Gly Ser Cys Cys Ile Ala Asp Leu Gly Leu Ala
370 375 380

Val Lys Phe Asn Ser Asp Thr Asn Glu Val Asp Ile Pro Leu Asn Thr
385 390 395 400

Arg Val Gly Thr Lys Arg Tyr Met Ala Pro Glu Val Leu Asp Glu Ser
405 410 415

Leu Asn Lys Asn His Phe Gln Pro Tyr Ile Met Ala Asp Ile Tyr Ser
420 425 430

Phe Gly Leu Ile Ile Trp Glu Met Ala Arg Arg Cys Ile Thr Gly Gly
435 440 445

Ile Val Glu Glu Tyr Gln Leu Pro Tyr Tyr Asn Met Val Pro Ser Asp
450 455 460

Pro Ser Tyr Glu Asp Met Arg Glu Val Val Cys Val Lys Arg Leu Arg
465 470 475 480

Pro Ile Val Ser Asn Arg Trp Asn Ser Asp Glu Cys Leu Arg Ala Val
485 490 495

Leu Lys Leu Met Ser Glu Cys Trp Ala His Asn Pro Ala Ser Arg Leu
500 505 510

Thr Ala Leu Arg Ile Lys Lys Thr Leu Ala Lys Met Val Glu Ser Gln
515 520 525

Asp Val Lys Ile
530

(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2160 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iii) ANTI-SENSE: NO

(v) FRAGMENT TYPE: internal

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Mouse

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 10..1524

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

CGCGGTTAC ATG GCG GAG TOG CCC GGA CCC TCC TCC TTC TTC CCC CTT 48 
Met Ala Glu Ser Ala Gly Ala Ser Ser Phe Phe Pro Leu
1 5 10

GTT GTC CTC CTG CTC GCC GGC AGO GGC GGG TCC GOG CCC CGG GGG ATC 96 
Val Val Leu Leu Leu Ala Gly Ser Gly Gly Ser Gly Pro Arg Gly Ile
15 20 25 

CAG GCT CTG CTG TOT GCC TCC ACC AGC TGC CTA CAG ACC AAC TAC ACC 144 
Gln Ala Leu Leu Cys Ala Cys Thr Ser Cys Leu Gln Thr Asn Tyr Thr
30 35 40 45 

TGT GAG ACA CAT GGG GCT TGC ATG GTC TCC ATC TTT AAC CTC CAT GGC 192 
Cys Glu Thr Asp Gly Ala Cys Met Val Ser Ile Phe Asn Leu Asp Gly
50 55 60

GTG GAG CAC CAT CTA CGT ACC TCC ATC CCC AAC CTG CAG CTG CTT CCT 240 
Val Glu His His Val Arg Thr Cys Ile Pro Lys Val Glu Leu Val Pro
65 70 75 

CCT GGA AAG CCC TTC TAC TGC CTG ACT TCA CAC CAT CTC CCC AAC ACA 288 
Ala Gly Lys Pro Phe Tyr Cys Leu Ser Ser Glu Asp Leu Arg Asn Thr
80 85 90 

CAC TGC TGC TAT ATT CAC TTC TGC AAC AAG ATT CAC CTC AGO CTC CCC 336 
His Cys Cys Tyr Ile Asp Phe Cys Asn Lys Ile Asp Leu Arg Val Pro
95 100 105 

AGC GGA CAC CTC AAC CAG CCT CCC CAC CCC TCC ATG TOG GGC CCT CTC 384 
Ser Gly His Leu Lys Glu Pro Ala His Pro Ser Met Trp Gly Pro Val
110 115 120 125

GAG CTC CTC GGC ATC ATC CCC CGC CCC CTC TTC CTC CTC TTC CTT ATC 432 
Glu Leu Val Gly Ile Ile Ala Gly Pro Val Phe Leu Leu Phe Leu Ile
130 135 140

ATT ATC ATC CTC TTC CTC CTC ATC AAC TXT CXC CXC CCT CTC TXC CAT 480
Ile Ile Ile Val Phe Leu Val Ile Asn Tyr His Gln Arg Val Tyr His
145 150 155

AAC CGC CXC AGO TTC CXC XTC CXC CXC CCC TCT TCC CXS ATC TCT CTC 528
Asn Arg Gln Arg Leu Asp Met Glu Asp Pro Ser Cys Glu Met Cys Leu
160 165 170

TCC AAA CAC AAG ACC CTC CAC CAT CTC CTC TAC CXC CTC TCC ACC TCA 576
Ser Lys Asp Lys Thr Leu Gln Asp Leu Val Tyr Asp Leu Ser Thr Ser
175 180 185

COG TCT CGC TCA CGC TTA CCC CTT TTT CTC CAC CCC ACA CTC CCC CCA 624
Gly Ser Gly Ser Gly Leu Pro Leu Phe Val Gln Arg Thr Val Ala Arg
190 195 200 205

ACC ATT CTT TTA CXA CXG XTT ATC GGC AAC CGC CCC TTC CGC CAA CTA 672
Thr Ile Val Leu Gln Glu Ile Ile Gly Lys Gly Arg Phe Gly Glu Val
210 215 220

TGG CGT GGT CGC TCC AGO CCT GGT GXC CTG CCT CTC XXX XTC TTC TCT 720
Trp Arg Gly Arg Trp Arg Gly Gly Asp Val Ala Val Lys Ile Phe Ser
225 230 235

TCT CCT CAA CAX CCC TCT TGG TTC CGT CAA CCA GAG ATC TAC GAG ACC 768
Ser Arg Glu Glu Arg Ser Trp Phe Arg Glu Ala Glu Ile Tyr Gln Thr
240 245 250

GTC ATG CTG CCC CAT CAA AAC ATC CTT CGC TTT ATT CCT CCT CAC AAT 816 
Val Met Leu Arg His Glu Asn Ile Leu Gly Phe Ile Ala Ala Asp Asn
255 260 265 

AAA GAT AAT GCC ACC TCC ACC CAG CTC TCC CTT GTC TCT GAC TAT CAC 864 
Lys Asp Asn Gly Thr Trp Thr Gln Leu Trp Leu Val Ser Asp Tyr His
270 275 280 285 

GAG CAT GGC TCA CTG TTT CAT TAT CTG AAC CCC TAC ACA CTC ACC ATT 912 
Glu His Gly Ser Leu Phe Asp Tyr Leu Asn Arg Tyr Thr Val Thr Ile
290 295 300 

GAG GGA ATG ATT AAG CTA GCC TTG TCT CCA CCC ACT CGT TTG CCA CAC 960
Glu Gly Met Ile Lys Leu Ala Leu Ser Ala Ala Ser Gly Leu Ala His
305 310 315 

CTC CAT ATG GAG ATT CTG GGC ACT CAA GGC AAG CCG GGA ATT CCT CAT 1008 
Leu His Met Glu Ile Val Gly Thr Gln Gly Lys Pro Gly Ile Ala His
320 325 330 

CGA GAC TTC AAG TCA AAG AAC ATC CTC GTC AAA AAA AAT CGC ATC TCT 1056 
Arg Asp Leu Lys Ser Lys Asn Ile Leu Val Lys Lys Asn Gly Met Cys
335 340 345 

CCC ATT GCA GAC CTG CGC CTG CCT CTC CGT CAT GAT CCC CTC ACT CAC 1104 
Ala Ile Ala Asp Leu Gly Leu Ala Val Arg His Asp Ala Val Thr Asp
350 355 360 365 

ACC ATA GAC ATT CCT CCA AAT CAC ACC CTC CGC ACC AAA CGA TAC ATC 1152 
Thr Ile Asp Ile Ala Pro Asn Gln Arg Val Gly Thr Lys Arg Tyr Met
370 375 380

OCT CCT GAA GTC CTT CAC GAG ACA ATC AAC ATG AAC CAC TXT OAC TCC 1200
Ala Pro Glu Val Leu Asp Glu Thr Ile Asn Met Lys His Phe Asp Ser
385 390 395

TTC AAA TGT CCC GAC ATC TAT OCC CTC OGO CTT GTC TAC TOO GAO ATT 1248 
Phe Lys Cys Ala Asp Ile Tyr Ala Leu Gly Leu Val Tyr Trp Glu Ile
400 405 410 

GCA CGA AGA TCC AAT TCT GGA CGA CTC CAT GAA CAC TAT CAA CTC CCC 1296 
Ala Arg Arg Cys Asn Ser Gly Gly Val His Glu Asp Tyr Gln Leu Pro
415 420 425 

TAT TAC CAC TTA CTC CCC TCC CAC CCT TCC ATT GAG GAG ATG CCA AAG 1344 
Tyr Tyr Asp Leu Val Pro Ser Asp Pro Ser Ile Glu Glu Met Arg Lys
430 435 440 445 

CTT CTA TCT CAC CAG AAG CTA CGG CCC AAT GTC CCC AAC TCC TCC CAG 1392 
Val Val Cys Asp Gln Lys Leu Arg Pro Asn Val Pro Asn Trp Trp Gln
450 455 460 

ACT TAT CAC CCC TTC CCA GTC ATC CCA AAC ATC ATC CCC CAC TCC TCC 1440 
Ser Tyr Glu Ala Leu Arg Val Met Gly Lys Met Met Arg Glu Cys Trp
465 470 475 

TAC GCC AAT GCT CCT GCC CCT CTC ACA CCT CTC CCC ATC AAG AAC ACT 1488 
Tyr Ala Asn Gly Ala Ala Arg Leu Thr Ala Leu Arg Ile Lys Lys Thr
480 485 490 



CTC TCC CAC CTA ACC GTC CAC CAA GAT CTC AAC ATT TAACCTCTTC 1534 
Leu Ser Gln Leu Ser Val Gln Glu Asp Val Lys Ile
495 500 505 



CTCTGCCTAC ACAAACAACC TCCCCACTCA GCATGACTCC ACCCACCCTC CAACCGTCCT 1594

CCACCCCTAT CCTCTTCTTT CTCCCCCCCC CTCTCCCACA GCCCTCCCCT CCAAGACCCA 1654

CACAGCCTCC GACACCCCCG CACTCCCCTT GGGTTTGACA CAGACACTTT TTATATTTAC 1714

CTCCTGATCG CATCCACACC TCACCAAATC ATGTACTCAC TCAATCCCAC AACTCAAACT 1774

CCTTCACTCG CAAGTACAGA GACCCACTCC ATTCCCTCTG CACGACCCTG ACCTCCTCCG 1834

CTCGCCACCA CCCCCCCCCA TACCTTGTGG TCCACTCCCC TCCACGTTTT CCTCCACGGA 1894

CCACTCAACT CCCATCAAGA TATTGACAGG AACCCCAACT TTCTCCCTCC TTCCCGTACC 1954

AGTCCTCACC CACACCATCC TTCTCATCGA CATCCCCACC ACTCCCCCTA GACACACAAC 2014

CTCCTGCCTC TCTCTCCAGC CAAGTCCCCA TCTCCCGACC TCTGTCCCAC ATTCTGCCTC 2074

CTCTGTGCCA CGCCCGTGTG TCTCTGTCTG TGTGTCAGTG ACTGTCTCTG TCTACACTTA 2134

ACCTCCTTGA GCTTCTCTGC ATCTCT 2160



(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 505 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

Met Ala Glu Ser Ala Gly Ala Ser Ser Phe Phe Pro Leu Val Val Leu
1 5 10 15

Leu Leu Ala Gly Ser Gly Gly Ser Gly Pro Arg Gly Ile Gln Ala Leu
20 25 30

Leu Cys Ala Cys Thr Ser Cys Leu Gln Thr Asn Tyr Thr Cys Glu Thr
35 40 45

Asp Gly Ala Cys Met Val Ser Ile Phe Asn Leu Asp Gly Val Glu His
50 55 60

His Val Arg Thr Cys Ile Pro Lys Val Glu Leu Val Pro Ala Gly Lys
65 70 75 80

Pro Phe Tyr Cys Leu Ser Ser Glu Asp Leu Arg Asn Thr His Cys Cys
85 90 95

Tyr Ile Asp Phe Cys Asn Lys Ile Asp Leu Arg Val Pro Ser Gly His
100 105 110

Leu Lys Glu Pro Ala His Pro Ser Met Trp Gly Pro Val Glu Leu Val
115 120 125

Gly Ile Ile Ala Gly Pro Val Phe Leu Leu Phe Leu Ile Ile Ile Ile
130 135 140

Val Phe Leu Val Ile Asn Tyr His Gln Arg Val Tyr His Asn Arg Gln
145 150 155 160

Arg Leu Asp Met Glu Asp Pro Ser Cys Glu Met Cys Leu Ser Lys Asp
165 170 175

Lys Thr Leu Gln Asp Leu Val Tyr Asp Leu Ser Thr Ser Gly Ser Gly
180 185 190

Ser Gly Leu Pro Leu Phe Val Gln Arg Thr Val Ala Arg Thr Ile Val
195 200 205

Leu Gln Glu Ile Ile Gly Lys Gly Arg Phe Gly Glu Val Trp Arg Gly
210 215 220

Arg Trp Arg Gly Gly Asp Val Ala Val Lys Ile Phe Ser Ser Arg Glu
225 230 235 240

Glu Arg Ser Trp Phe Arg Glu Ala Glu Ile Tyr Gln Thr Val Met Leu
245 250 255

Arg His Glu Asn Ile Leu Gly Phe Ile Ala Ala Asp Asn Lys Asp Asn
260 265 270

Gly Thr Trp Thr Gln Leu Trp Leu Val Ser Asp Tyr His Glu His Gly
275 280 285

Ser Leu Phe Asp Tyr Leu Asn Arg Tyr Thr Val Thr Ile Glu Gly Met
290 295 300

Ile Lys Leu Ala Leu Ser Ala Ala Ser Gly Leu Ala His Leu His Met
305 310 315 320

Glu Ile Val Gly Thr Gln Gly Lys Pro Gly Ile Ala His Arg Asp Leu
325 330 335

Lys Ser Lys Asn Ile Leu Val Lys Lys Asn Gly Met Cys Ala Ile Ala
340 345 350

Asp Leu Gly Leu Ala Val Arg His Asp Ala Val Thr Asp Thr Ile Asp
355 360 365

Ile Ala Pro Asn Gln Arg Val Gly Thr Lys Arg Tyr Met Ala Pro Glu
370 375 380

Val Leu Asp Glu Thr Ile Asn Met Lys His Phe Asp Ser Phe Lys Cys
385 390 395 400

Ala Asp Ile Tyr Ala Leu Gly Leu Val Tyr Trp Glu Ile Ala Arg Arg
405 410 415

Cys Asn Ser Gly Gly Val His Glu Asp Tyr Gln Leu Pro Tyr Tyr Asp
420 425 430

Leu Val Pro Ser Asp Pro Ser Ile Glu Glu Met Arg Lys Val Val Cys
435 440 445

Asp Gln Lys Leu Arg Pro Asn Val Pro Asn Trp Trp Gln Ser Tyr Glu
450 455 460

Ala Leu Arg Val Met Gly Lys Met Met Arg Glu Cys Trp Tyr Ala Asn
465 470 475 480

Gly Ala Ala Arg Leu Thr Ala Leu Arg Ile Lys Lys Thr Leu Ser Gln
485 490 495

Leu Ser Val Gln Glu Asp Val Lys Ile
500 505

(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1952 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO
(v) FRAGMENT TYPE: internal

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Mouse

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 187..1692

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

AAGCCGCGGC AGAAGTTGCC CGCCTGCTGC TCGTAGTGAG GGCGCGGAGG ACCCGGCACC 60 

TGGGAAGCGG CGGCGGGTTA ACTTCGGCTG AATCACAACC ATTTGGCGCT CAGCTATGAC 120 

AAGAGAGCAA ACAAAAAGTT AAAGGAGCAA CCOGGCCATA AGTCAAGAGA GAAGTTTATT 180 

GATAAC ATC CTC TTA CGA AGC TCT CCA AAA TTA AAT CTG GGC ACC AAG 228 
Met Leu Leu Arg Ser Ser Gly Lys Leu Asn Val Gly Thr Lys

1 5 10 

AAG GAG GAT GGA GAG ACT ACA GCC CCC ACC CCT CGC CCC AAG ATC CTA 276 
Lys Glu Asp Gly Glu Ser Thr Ala Pro Thr Pro Arg Pro Lys Ile Leu
15 20 25 30 

CCT TCT AAA TCC CAC CAC CAC TCT CCC GAA CAC TCA GTC AAC AAT ATC 324 
Arg Cys Lys Cys His His His Cys Pro Glu Asp Ser Val Asn Asn Ile
35 40 45 

TCC AGC ACA GAT GGG TAC TGC TTC ACG ATC ATA GAA GAA CAT GAC TCT 372 
Cys Ser Thr Asp Gly Tyr Cys Phe Thr Met Ile Glu Glu Asp Asp Ser
50 55 60 

CGA ATC CCT GTT CTC ACC TCT CCA TCT CTA CCA CTA GAA GGG TCA CAT 420 
Gly Met Pro Val Val Thr Ser Gly Cys Leu Gly Leu Glu Gly Ser Asp
65 70 75 

TTT CAA TGT CCT GAC ACT CCC ATT CCT CAT CAA ACA AGA TCA ATT GAA 468 
Phe Gln Cys Arg Asp Thr Pro Ile Pro His Gln Arg Arg Ser Ile Glu
80 85 90 

TGC TGC ACA CAA AGG AAT CAC TCT AAT AAA CAC CTC CAC CCC ACT CTC 516 
Cys Cys Thr Glu Arg Asn Glu Cys Asn Lys Asp Leu His Pro Thr Leu
95 100 105 110

CCT CCT CTC AAG GAC AGA GAT TTT GTT CAT GGG CCC ATA CAC CAC AAG 564 
Pro Pro Leu Lys Asp Arg Asp Phe Val Asp Gly Pro Ile His His Lys
115 120 125 

GCC TTG CTT ATC TCT CTG ACT CTC TCT AGT TTA CTC TTC CTC CTC ATT 612 
Ala Leu Leu Ile Ser Val Thr Val Cys Ser Leu Leu Leu Val Leu Ile
130 135 140 

ATT TTA TTC TCT TAC TTC AGG TAT AAA AGA CAA CAA CCC CGA CCT CGC 660 
Ile Leu Phe Cys Tyr Phe Arg Tyr Lys Arg Gln Glu Ala Arg Pro Arg
145 150 155

TAG ACC ATT COG CTO GAC CAC CAC CAO ACA TAG ATT CCT CCT CCA CAC 708
Tyr Ser Ile Gly Leu Glu Gln Asp Glu Thr Tyr Ile Pro Pro Gly Glu
160 165 170 

TCC CTG ACA GAC TTC ATC CAC CAC TCT CAC ACC TCC CCA ACT CCA TCA 756 
Ser Leu Arg Asp Leu Ile Glu Gln Ser Gln Ser Ser Gly Ser Gly Ser
175 180 185 190 

CCC CTG CCT CTG CTG GTC CAA ACG ACA ATA CCT AAG CAA ATT CAG ATG 804 
Gly Leu Pro Leu Leu Val Gln Arg Thr Ile Ala Lys Gln Ile Gln Met
195 200 205 

CTG AAC CAC ATT CCA AAA CCC CCC TAT CCC CAG CTO TCC ATC CCA AAG 852
Val Lys Gln Ile Gly Lys Gly Arg Tyr Gly Glu Val Trp Met Gly Lys
210 215 220 

TCG CGT GGA CAA AAC CTG CCT CTG AAA CTG TTC TTC ACC ACG GAG CAA 900 
Trp Arg Gly Glu Lys Val Ala Val Lys Val Phe Phe Thr Thr Glu Glu
225 230 235 

CCC ACC TGC TTC CCA CAG ACT CAC ATA TAT CAC ACC CTC CTG ATG CCC 948 
Ala Ser Trp Phe Arg Glu Thr Glu Ile Tyr Gln Thr Val Leu Met Arg
240 245 250 

CAT CAG AAT ATT CTG GGG TTC ATT CCT CCA GAT ATC AAA CGG ACT GGG 996 
His Glu Asn Ile Leu Gly Phe Ile Ala Ala Asp Ile Lys Gly Thr Gly
255 260 265 270 

TCC TGC ACT CAC TTG TAC CTC ATC ACA CAC TAT CAT CAA AAC GCC TCC 1044 
Ser Trp Thr Gln Leu Tyr Leu Ile Thr Asp Tyr His Glu Asn Gly Ser
275 280 285 

CTT TAT GAC TAT CTG AAA TCC ACC ACC TTA CAC CCA AAG TCC ATG CTG 1092 
Leu Tyr Asp Tyr Leu Lys Ser Thr Thr Leu Asp Ala Lys Ser Met Leu
290 295 300 

AAG CTA GCC TAC TCC TCT CTC ACC CCC CTA TCC CAT TTA CAC ACG CAA 1140 
Lys Leu Ala Tyr Ser Ser Val Ser Gly Leu Cys His Leu His Thr Glu
305 310 315 

ATC TTT AGC ACT CAA CGC AAG CCA CCA ATC CCC CAT CGA CAC TTC AAA 1188 

Ile Phe Ser Thr Gln Gly Lys Pro Ala Ile Ala His Arg Asp Leu Lys

320 325 330 

ACT AAA AAC ATC CTC GTC AAG AAA AAT CCA ACT TCC TGC ATA CCA GAC 1236 
Ser Lys Asn Ile Leu Val Lys Lys Asn Gly Thr Cys Cys Ile Ala Asp
335 340 345 350 

CTG GCC TTG CCT GTC AAG TTC ATT ACT CAC ACA AAT GAG GTT CAC ATC 1284 
Leu Gly Leu Ala Val Lys Phe Ile Ser Asp Thr Asn Glu Val Asp Ile
355 360 365 

CCA CCC AAC ACC CGG CTT GCC ACC AAG CGC TAT ATC CCT CCA CAA CTG 1332 
Pro Pro Asn Thr Arg Val Gly Thr Lys Arg Tyr Met Pro Pro Glu Val
370 375 380 

CTG GAC GAG AGC TTG AAT AGA AAC CAT TTC CAO TCC TAC ATT ATG CCT 1380 
Leu Asp Glu Ser Leu Asn Arg Asn His Phe Gln Ser Tyr Ile Met Ala
385 390 395

GXC ATG TAC AGC TTT CCA CTC ATC CTC TCC GAG ATT CCA AGO ACA TOT 1428
Asp Met Tyr Ser Phe Gly Leu Ile Leu Trp Glu Ile Ala Arg Arg Cys
400 405 410 

GTT TCT GGA CGT ATA CTG CAA CAA TAC CAG CTT CCC TAT CAC CAC CTG 1476 
Val Ser Gly Gly Ile Val Glu Glu Tyr Gln Leu Pro Tyr His Asp Leu
415 420 425 430 

GTG CCC AGT GAC CCT TCT TAT GAG CAC ATC AGA GAA ATT CTC TCC ATC 1524 
Val Pro Ser Asp Pro Ser Tyr Glu Asp Met Arg Glu Ile Val Cys Met
435 440 445 

AAG AAG TTA COG CCT TCA TTC CCC AAT CGA TGG AGC AGT CAT CAG TGT 1572 
Lys Lys Leu Arg Pro Ser Phe Pro Asn Arg Trp Ser Ser Asp Glu Cys
450 455 460 

CTC AGO CAG ATG GGC AAG CTT ATC ACA CAG TGC TGG CCG CAG AAT CCT 1620 
Leu Arg Gln Met Gly Lys Leu Met Thr Glu Cys Trp Ala Gln Asn Pro
465 470 475 

GCC TCC AGG CTC ACG CCC CTG AGA CTT AAG AAA ACC CTT GCC AAA ATG 1668 
Ala Ser Arg Leu Thr Ala Leu Arg Val Lys Lys Thr Leu Ala Lys Met
480 485 490 

TCA CAG TCC CAG CAC ATT AAA CTC TGACCTCAGA TACTTGTGCA CAGAGCAAGA 1722 
Ser Glu Ser Gln Asp Ile Lys Leu
495 500 

ATTTCACAGA AGCATCGTTA GCCCAAGCCT TGAACGTTAG CCTACTGCCC AGTGAGTTCA 1782 

GACTTTCCTG CAAGAGAGCA CGGTGGGCAG ACACAGAGGA ACCCAGAAAC ACGCATTCAT 1842 

CATGCCTTTC TGAGCAGGAG AAACTGTTTC GGTAACTTGT TCAACATATG ATGCATCTTG 1902 

CTTTCTAAGA AACCCCTGTA TTTTGAATTA CCATTTTTTT ATAAAAAAAA 1952 

(2) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 502 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:

Met Leu Leu Arg Ser Ser Gly Lys Leu Asn Val Gly Thr Lys Lys Glu
1 5 10 15

Asp Gly Glu Ser Thr Ala Pro Thr Pro Arg Pro Lys Ile Leu Arg Cys
20 25 30

Lys Cys His His His Cys Pro Glu Asp Ser Val Asn Asn Ile Cys Ser
35 40 45

Thr Asp Gly Tyr Cys Phe Thr Met Ile Glu Glu Asp Asp Ser Gly Met
50 55 60

Pro Val Val Thr Ser Gly Cys Leu Gly Leu Glu Gly Ser Asp Phe Gln
65 70 75 80

Cys Arg Asp Thr Pro Ile Pro His Gln Arg Arg Ser Ile Glu Cys Cys
85 90 95

Thr Glu Arg Asn Glu Cys Asn Lys Asp Leu His Pro Thr Leu Pro Pro
100 105 110

Leu Lys Asp Arg Asp Phe Val Asp Gly Pro Ile His His Lys Ala Leu
115 120 125

Leu Ile Ser Val Thr Val Cys Ser Leu Leu Leu Val Leu Ile Ile Leu
130 135 140

Phe Cys Tyr Phe Arg Tyr Lys Arg Gln Glu Ala Arg Pro Arg Tyr Ser
145 150 155 160

Ile Gly Leu Glu Gln Asp Glu Thr Tyr Ile Pro Pro Gly Glu Ser Leu
165 170 175

Arg Asp Leu Ile Glu Gln Ser Gln Ser Ser Gly Ser Gly Ser Gly Leu
180 185 190

Pro Leu Leu Val Gln Arg Thr Ile Ala Lys Gln Ile Gln Met Val Lys
195 200 205

Gln Ile Gly Lys Gly Arg Tyr Gly Glu Val Trp Met Gly Lys Trp Arg
210 215 220

Gly Glu Lys Val Ala Val Lys Val Phe Phe Thr Thr Glu Glu Ala Ser
225 230 235 240

Trp Phe Arg Glu Thr Glu Ile Tyr Gln Thr Val Leu Met Arg His Glu
245 250 255

Asn Ile Leu Gly Phe Ile Ala Ala Asp Ile Lys Gly Thr Gly Ser Trp
260 265 270

Thr Gln Leu Tyr Leu Ile Thr Asp Tyr His Glu Asn Gly Ser Leu Tyr
275 280 285

Asp Tyr Leu Lys Ser Thr Thr Leu Asp Ala Lys Ser Met Leu Lys Leu
290 295 300

Ala Tyr Ser Ser Val Ser Gly Leu Cys His Leu His Thr Glu Ile Phe
305 310 315 320

Ser Thr Gln Gly Lys Pro Ala Ile Ala His Arg Asp Leu Lys Ser Lys
325 330 335

Asn Ile Leu Val Lys Lys Asn Gly Thr Cys Cys Ile Ala Asp Leu Gly
340 345 350

Leu Ala Val Lys Phe Ile Ser Asp Thr Asn Glu Val Asp Ile Pro Pro
355 360 365

Asn Thr Arg Val Gly Thr Lys Arg Tyr Met Pro Pro Glu Val Leu Asp
370 375 380

Glu Ser Leu Asn Arg Asn His Phe Gln Ser Tyr Ile Met Ala Asp Met
385 390 395 400

Tyr Ser Phe Gly Leu Ile Leu Trp Glu Ile Ala Arg Arg Cys Val Ser
405 410 415

Gly Gly Ile Val Glu Glu Tyr Gln Leu Pro Tyr His Asp Leu Val Pro
420 425 430

Ser Asp Pro Ser Tyr Glu Asp Met Arg Glu Ile Val Cys Met Lys Lys
435 440 445

Leu Arg Pro Ser Phe Pro Asn Arg Trp Ser Ser Asp Glu Cys Leu Arg
450 455 460

Gln Met Gly Lys Leu Met Thr Glu Cys Trp Ala Gln Asn Pro Ala Ser
465 470 475 480

Arg Leu Thr Ala Leu Arg Val Lys Lys Thr Leu Ala Lys Met Ser Glu
485 490 495

Ser Gln Asp Ile Lys Leu
500

(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 base pairs

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iii) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:
GCGGATCCTG TTGTGAAGCN AATATGTG 28 



(2) INFORMATION FOR SEQ ID NO: 20:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 
(iii) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:
GCGATCOGTC GCAGTCAAAA TTTT 24 



(2) INFORMATION FOR SEQ ID NO: 21:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 26 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:
GCGCATCCGC CATATATTAA AAGCAA 26 



(2) INFORMATION FOR SEQ ID NO: 22:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:
CGGAATTCTG GTCCCATATA 20 



(2) INFORMATION FOR SEQ ID NO: 23:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 37 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:
ATTCAAGGGC ACATCAACTT CATTTCTCTC ACTCTTG 

(2) INFORMATION FOR SEQ ID NO: 24:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 26 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:
CCGGATCCAC CATGGCGCAG TCGCCC 

(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 
(iii) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25: 
AACACCGGCC CCGCGATGAT 

(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide
(v) FRAGMENT TYPE: internal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:

Gly Xaa Gly Xaa Xaa Gly

1 5 

(2) INFORMATION FOR SEQ ID NO: 27:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 6 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:

Asp Phe Lys Ser Arg Asn

1 5 



(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:

Asp Leu Lys Ser Lys Asn

1 5 



(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:

Gly Thr Lys Arg Tyr Met
1 5